(57) Abstract: A method of modifying a test NR polypeptide is disclosed. The method can include: providing a test NR polypeptide
sequence having a characteristic that is targeted for modification; aligning the test NR polypeptide sequence with at least one
reference NR polypeptide sequence for which an X-ray structure is available, wherein the at least one reference NR polypeptide
sequence has a characteristic that is desired for the test NR polypeptide; building a three-dimensional model for the test NR
polypeptide using the three-dimensional coordinates of the X-ray structure(s) of the at least one reference polypeptide and its
sequence alignment with the test NR polypeptide sequence; examining the three-dimensional model of the test NR polypeptide for
differences with the at least one reference polypeptide that are associated with the desired characteristic; and mutating at least
one amino acid residue in the test NR polypeptide sequence located at a difference identified above to a residue associated with
the desired characteristic, whereby the test NR polypeptide is modified. An isolated GR polypeptide comprising a mutation in a
ligand binding domain, wherein the mutation alters the solubility of the ligand binding domain, is also disclosed. An isolated GR
polypeptide, or functional portion thereof, having one or more mutations comprising a substitution of a hydrophobic amino acid
residue by a hydrophilic amino acid residue is also disclosed. Representative mutations are F602S and F602D substitutions.
Expression of the GR polypeptide in E. coli is also provided. A solved three-dimensional crystal structure of a glucocorticoid
receptor α ligand binding domain polypeptide is also disclosed, along with a crystalline form of the glucocorticoid receptor α
ligand binding domain polypeptide. Methods of designing modulators of the biological activity of glucocorticoid receptor α and
other nuclear receptor, steroid receptor and glucocorticoid receptor polypeptides and nuclear receptor, steroid receptor and
glucocorticoid receptor ligand binding domain polypeptides are also disclosed.

CRYSTALLIZED GLUCOCORTICOID RECEPTOR LIGAND BINDING DOMAIN
POLYPEPTIDE AND SCREENING METHODS EMPLOYING SAME

Cross Reference to Related Applications
The present patent application is based on and claims priority to U.S.
Provisional Application Serial No. 60/305,902, entitled "CRYSTALLIZED 
GLUCOCORTICOID RECEPTOR LIGAND BINDING DOMAIN POLYPEPTIDE 
AND SCREENING METHODS EMPLOYING SAME", which was filed July 17, 
2001 and is incorporated herein by reference in its entirety. 



Technical Field 

The present invention relates generally to a modified glucocorticoid receptor
polypeptide, to a modified glucocorticoid receptor ligand binding domain
polypeptide, to the structure of a glucocorticoid receptor ligand binding domain
and to the structure of a glucocorticoid receptor ligand binding domain in complex
with a ligand and a co-activator. The invention further relates to methods by
which a soluble glucocorticoid polypeptide can be generated and by which
modulators and ligands of nuclear receptors, particularly steroid receptors and
more particularly glucosteroid receptors and the ligand binding domains thereof
can be identified.

Abbreviations

ATP         adenosine triphosphate
ADP         adenosine diphosphate
AR          androgen receptor
CAT         chloramphenicol acetyltransferase
CBP         CREB binding protein
cDNA        complementary DNA
DBD         DNA binding domain
DMSO        dimethyl sulfoxide
DNA         deoxyribonucleic acid
DTT         dithiothreitol
EDTA        ethylenediaminetetraacetic acid
ER          estrogen receptor
GR          glucocorticoid receptor
GRE         glucocorticoid responsive element
GST         glutathione S-transferase
HEPES       N-2-Hydroxyethylpiperazine-N'-2-ethanesulfonic acid
HSP         heat shock protein
kDa         kilodalton(s)
LBD         ligand binding domain
MR          mineralocorticoid receptor
NDP         nucleotide diphosphate
NID         nuclear receptor interaction domain
NTP         nucleotide triphosphate
PAGE        polyacrylamide gel electrophoresis
PCR         polymerase chain reaction
pI          isoelectric point
PPAR        peroxisome proliferator-activated receptor
PR          progesterone receptor
RAR         retinoic acid receptor
RXR         retinoid X receptor
SDS         sodium dodecyl sulfate
SDS-PAGE    sodium dodecyl sulfate polyacrylamide gel electrophoresis
TIF2        transcription intermediary factor 2
TR          thyroid receptor
VDR         vitamin D receptor



Amino Acid Abbreviations

Single-Letter Code    Three-Letter Code    Name
A                     Ala                  Alanine
V                     Val                  Valine
L                     Leu                  Leucine
I                     Ile                  Isoleucine
P                     Pro                  Proline
F                     Phe                  Phenylalanine
W                     Trp                  Tryptophan
M                     Met                  Methionine
G                     Gly                  Glycine
S                     Ser                  Serine
T                     Thr                  Threonine
C                     Cys                  Cysteine
Y                     Tyr                  Tyrosine
N                     Asn                  Asparagine
Q                     Gln                  Glutamine
D                     Asp                  Aspartic Acid
E                     Glu                  Glutamic Acid
K                     Lys                  Lysine
R                     Arg                  Arginine
H                     His                  Histidine

Amino Acid                      Codons
Alanine          Ala    A       GCA GCC GCG GCU
Cysteine         Cys    C       UGC UGU
Aspartic Acid    Asp    D       GAC GAU
Glutamic Acid    Glu    E       GAA GAG
Phenylalanine    Phe    F       UUC UUU
Glycine          Gly    G       GGA GGC GGG GGU
Histidine        His    H       CAC CAU
Isoleucine       Ile    I       AUA AUC AUU
Lysine           Lys    K       AAA AAG
Methionine       Met    M       AUG
Asparagine       Asn    N       AAC AAU
Proline          Pro    P       CCA CCC CCG CCU
Glutamine        Gln    Q       CAA CAG
Threonine        Thr    T       ACA ACC ACG ACU
Valine           Val    V       GUA GUC GUG GUU
Tryptophan       Trp    W       UGG
Tyrosine         Tyr    Y       UAC UAU
Leucine          Leu    L       UUA UUG CUA CUC CUG CUU
Arginine         Arg    R       AGA AGG CGA CGC CGG CGU
Serine           Ser    S       AGC AGU UCA UCC UCG UCU
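
As an illustration only, the codon table can be held as a lookup structure and inverted to translate a coding sequence. This is a minimal sketch, not part of the disclosed methods; the names `CODONS`, `CODON_TO_AA` and `translate` are hypothetical, and the example sequence is a toy input.

```python
# Hypothetical helper names. Codon assignments follow the standard genetic
# code in the RNA alphabet; stop codons are omitted, as in the table.
# Serine is listed with AGC (ACG encodes threonine).
CODONS = {
    "A": "GCA GCC GCG GCU",
    "C": "UGC UGU",
    "D": "GAC GAU",
    "E": "GAA GAG",
    "F": "UUC UUU",
    "G": "GGA GGC GGG GGU",
    "H": "CAC CAU",
    "I": "AUA AUC AUU",
    "K": "AAA AAG",
    "M": "AUG",
    "N": "AAC AAU",
    "P": "CCA CCC CCG CCU",
    "Q": "CAA CAG",
    "T": "ACA ACC ACG ACU",
    "V": "GUA GUC GUG GUU",
    "W": "UGG",
    "Y": "UAC UAU",
    "L": "UUA UUG CUA CUC CUG CUU",
    "R": "AGA AGG CGA CGC CGG CGU",
    "S": "AGC AGU UCA UCC UCG UCU",
}

# Invert the table: codon -> single-letter amino acid code.
CODON_TO_AA = {
    codon: aa for aa, codons in CODONS.items() for codon in codons.split()
}


def translate(rna: str) -> str:
    """Translate an RNA coding sequence that contains no stop codons."""
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3))
```

For example, `translate("UUCAGCGAU")` reads the three codons UUC, AGC and GAU as Phe, Ser and Asp.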



Background Art

Nuclear receptors reside in either the cytoplasm or nucleus of eukaryotic
cells and represent a superfamily of proteins that specifically bind a physiologically
relevant small molecule, such as a hormone or vitamin. As a result of a molecule
binding to a nuclear receptor, the nuclear receptor changes the ability of a cell to
transcribe DNA, i.e. nuclear receptors modulate the transcription of DNA.
However, they can also have transcription independent actions.

Unlike integral membrane receptors and membrane-associated receptors,
nuclear receptors reside in either the cytoplasm or nucleus of eukaryotic cells.
Thus, nuclear receptors comprise a class of intracellular, soluble, ligand-regulated
transcription factors. Nuclear receptors include but are not limited to receptors for
androgens, mineralocorticoids, progestins, estrogens, thyroid hormones, vitamin D,
retinoids, eicosanoids, peroxisome proliferators and, pertinently, glucocorticoids.
Many nuclear receptors, identified by either sequence homology to known
receptors (See, e.g., Drewes et al., (1996) Mol. Cell. Biol. 16:925-31) or based on
their affinity for specific DNA binding sites in gene promoters (See, e.g., Sladek et
al., Genes Dev. 4:2353-65), have unascertained ligands and are therefore
commonly termed "orphan receptors".

Glucocorticoids are an example of a cellular molecule that has been
associated with cellular proliferation. Glucocorticoids are known to induce growth
arrest in the G1-phase of the cell cycle in a variety of cells, both in vivo and in
vitro, and have been shown to be useful in the treatment of certain cancers. The
glucocorticoid receptor (GR) belongs to an important class of transcription factors
that alter the expression of target genes in response to a specific hormone signal.
Accumulated evidence indicates that receptor associated proteins play key roles
in regulating glucocorticoid signaling. The list of cellular proteins that can bind and
co-purify with the GR is constantly expanding.

Glucocorticoids are also used for their anti-inflammatory effect on the skin,
joints, and tendons. They are important for treatment of disorders where
inflammation is thought to be caused by immune system activity. Representative
disorders of this sort include but are not limited to rheumatoid arthritis,
inflammatory bowel disease, glomerulonephritis, and connective tissue diseases
like systemic lupus erythematosus. Glucocorticoids are also used to treat asthma
and are widely used with other drugs to prevent the rejection of organ transplants.
Some cancers of the blood (leukemias) and lymphatic system (lymphomas) can
also respond to corticosteroid drugs.

Glucocorticoids exert several effects in tissues that express receptors for
them. They regulate the expression of several genes either positively or
negatively and in a direct or indirect manner. They are also known to arrest the
growth of certain lymphoid cells and in some cases cause cell death (Harmon et
al., (1979) J. Cell Physiol. 98: 267-278; Yamamoto, (1985) Ann. Rev. Genet. 19:
209-252; Evans, (1988) Science 240: 889-895; Beato, (1989) Cell 56: 335-344;
Thompson, (1989) Cancer Res. 49: 2259s-2265s). Due in part to their ability to
kill cells, glucocorticoids have been used for decades in the treatment of
leukemias, lymphomas, breast cancer, solid tumors and other diseases involving
irregular cell growth, e.g. psoriasis. The inclusion of glucocorticoids in
chemotherapeutic regimens has contributed to a high rate of cure of certain
leukemias and lymphomas which were formerly lethal (Homo-Delarche (1984)
Cancer Res. 44: 431-437). Although it is clear that glucocorticoids exert these
effects after binding to their receptors, the mechanism of cell kill is not completely
understood; several hypotheses have been proposed. Among the more
prominent hypotheses are: the deinduction of critical lymphokines, oncogenes and
growth factors; the induction of supposed "lysis genes"; alterations in calcium ion
influx; the induction of endonucleases; and the induction of a cyclic AMP-dependent
protein kinase (McConkey et al., (1989) Arch. Biochem. Biophys. 269:
365-370; Cohen & Duke, (1984) J. Immunol.; Eastman-Rees &
Vedeckis, (1986) Cancer Res. 46: 2457-2462; Kelso & Munck, (1984) J. Immunol.
133: 784-791; Gruol et al., (1989) Molec. Endocrinol. 3: 2119-2127; Yuh &
Thompson, (1989) J. Biol. Chem. 264: 10904-10910).

Polypeptides, including the glucocorticoid receptor ligand binding domain, 
have a three-dimensional structure determined by the primary amino acid 
sequence and the environment surrounding the polypeptide. This three-dimensional
structure establishes the polypeptide's activity, stability, binding
affinity, binding specificity, and other biochemical attributes. Thus, knowledge of a 
protein's three-dimensional structure can provide much guidance in designing 
agents that mimic, inhibit, or improve its biological activity. 

The three-dimensional structure of a polypeptide can be determined in a
number of ways. Many of the most precise methods employ X-ray crystallography
(See, e.g., Van Holde, (1971) Physical Biochemistry, Prentice-Hall, New Jersey,
pp. 221-39). This technique relies on the ability of crystalline lattices to diffract
X-rays or other forms of radiation. Diffraction experiments suitable for determining
the three-dimensional structure of macromolecules typically require high-quality
crystals. Unfortunately, such crystals have been unavailable for the ligand binding
domain of a human glucocorticoid receptor, as well as many other proteins of
interest. Thus, high-quality diffracting crystals of the ligand binding domain of a
human glucocorticoid receptor in complex with a ligand and a peptide would
greatly assist in the elucidation of its three-dimensional structure.

Clearly, the solved crystal structure of the ligand binding domain of a
glucocorticoid receptor polypeptide would be useful in the design of modulators of
activity mediated by the glucocorticoid receptor. Evaluation of the available
sequence data shows that GRα is particularly similar to MR, PR and AR. The
GRα LBD has approximately 56%, 54% and 50% sequence identity to the MR, PR
and AR LBDs, respectively. The GRβ amino acid sequence is identical to the
GRα amino acid sequence for residues 1-727, but the remaining 15 residues in
GRβ show no significant similarity to the remaining 50 residues in GRα. If no
X-ray structure were available for GRα, then one could build a model for GRα using
the available X-ray structures of PR and/or AR as templates. These theoretical
models have some utility, but cannot be as accurate as a true X-ray structure,
such as the X-ray structure disclosed here. Because of its limited accuracy, a
model for GRα will generally be less useful than an X-ray structure for the design
of agonists, antagonists and modulators of GRα.
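
Percent sequence identity figures such as those quoted above are computed from a pairwise alignment. The following is a minimal sketch, assuming two pre-aligned sequences of equal length with `-` marking gaps. The function name `percent_identity` is hypothetical, and the convention used here (identical residues divided by ungapped aligned columns) is only one of several in common use; the source does not state which convention produced its figures.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)
```

With the toy alignment `percent_identity("MKTLLV-A", "MRTLIVQA")`, seven columns are ungapped and five of them match, so the result is 5/7 of 100.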

The solved GRα-ligand-co-activator crystal structure would provide
structural details and insights necessary to design a modulator of GRα that
maximizes preferred requirements for any modulator, i.e. potency and specificity.
By exploiting the structural details obtained from a GRα-ligand-co-activator crystal
structure, it would be possible to design a GRα modulator that, despite GRα's
similarity with other steroid receptors and nuclear receptors, exploits the unique
structural features of the ligand binding domain of human GRα. A GRα modulator
developed using structure-assisted design would take advantage of heretofore
unknown GRα structural considerations and thus be more effective than a
modulator developed using homology-based design. Potential or existent
homology models cannot provide the necessary degree of specificity. A GRα
modulator designed using the structural coordinates of a crystalline form of the
ligand binding domain of GRα in complex with a ligand and a co-activator would
also provide a starting point for the development of modulators of other nuclear
receptors.

Although several journal articles have referred to GR mutants having
"increased ligand efficacy" in cell-based assays, it has not been mentioned that
such mutants could have improved solution properties so that they could provide a
suitable reagent for purification, assay, and crystallization. See Garabedian &
Yamamoto (1992) Mol. Biol. Cell 3: 1245-1257; Kralli et al., (1995) Proc. Natl.
Acad. Sci. 92: 4701-4705; Bohen (1995) J. Biol. Chem. 270: 29433-29438; Bohen
(1998) Mol. Cell. Biol. 18: 3330-3339; Freeman et al., (2000) Genes Dev. 14:
422-434.

Indeed, it is well documented that GR associates with molecular
chaperones (such as hsp90, hsc70, and p23). In the past, it has been considered
that GR would either not be active or soluble if purified away from these binding
partners. In fact, it has even been mentioned that GR must be in complex with
hsp90 in order to adopt a high affinity steroid binding conformation. See Xu et al.
(1998) J. Biol. Chem. 273: 13918-13924; Rajapandi et al. (2000) J. Biol. Chem.
275: 22597-22604.



Still other journal articles have reported E. coli expression of GST-GR, but
also noted a failure to purify the purported polypeptide. See Ohara-Nemoto et al.,
(1990) J. Steroid Biochem. Molec. Biol. 37: 481-490; Caamano et al., (1994)
Annal. NY Acad. Sci. 746: 68-77.

What is needed, therefore, is a purified, soluble GRα LBD polypeptide for
use in structural studies, as well as methods for making the same. Such methods
would also find application in the preparation of modified NRs in general.

What is also needed is a crystallized form of a GRα ligand binding domain,
preferably in complex with a ligand and more preferably in complex with a ligand
and a co-activator. Acquisition of crystals of the GRα ligand binding domain
polypeptide permits the three-dimensional structure of a GRα ligand binding
domain (LBD) polypeptide to be determined. Knowledge of the three dimensional
structure can facilitate the design of modulators of GR-mediated activity. Such
modulators can lead to therapeutic compounds to treat a wide range of conditions,
including inflammation, tissue rejection, auto-immunity, malignancies such as
leukemias and lymphomas, Cushing's syndrome, acute adrenal insufficiency,
congenital adrenal hyperplasia, rheumatic fever, polyarteritis nodosa,
granulomatous polyarteritis, inhibition of myeloid cell lines, immune
proliferation/apoptosis, HPA axis suppression and regulation, hypercortisolemia,
modulation of the TH1/TH2 cytokine balance, chronic kidney disease, stroke and
spinal cord injury, hypercalcemia, hyperglycemia, acute adrenal insufficiency,
chronic primary adrenal insufficiency, secondary adrenal insufficiency, congenital
adrenal hyperplasia, cerebral edema, thrombocytopenia, Little's syndrome,
inflammatory bowel disease, systemic lupus erythematosus, polyarteritis nodosa,
Wegener's granulomatosis, giant cell arteritis, rheumatoid arthritis, osteoarthritis,
hay fever, allergic rhinitis, urticaria, angioneurotic edema, chronic obstructive
pulmonary disease, asthma, tendonitis, bursitis, Crohn's disease, ulcerative colitis,
autoimmune chronic active hepatitis, organ transplantation, hepatitis, cirrhosis,
inflammatory scalp alopecia, panniculitis, psoriasis, discoid lupus erythematosus,
inflamed cysts, atopic dermatitis, pyoderma gangrenosum, pemphigus vulgaris,
bullous pemphigoid, systemic lupus erythematosus, dermatomyositis, herpes
gestationis, eosinophilic fasciitis, relapsing polychondritis, inflammatory vasculitis,
sarcoidosis, Sweet's disease, type 1 reactive leprosy, capillary hemangiomas,
contact dermatitis, atopic dermatitis, lichen planus, exfoliative dermatitis,
erythema nodosum, acne, hirsutism, toxic epidermal necrolysis, erythema
multiforme, cutaneous T-cell lymphoma. Other applications of a GR modulator
developed in accordance with the present invention can be employed to treat
Human Immunodeficiency Virus (HIV), cell apoptosis, and can be employed in
treating cancerous conditions including, but not limited to, Kaposi's sarcoma,
immune system activation and modulation, desensitization of inflammatory
responses, IL-1 expression, natural killer cell development, lymphocytic leukemia,
and treatment of retinitis pigmentosa. Other applications for such a modulator
comprise modulating cognitive performance, memory and learning enhancement,
depression, addiction, mood disorders, chronic fatigue syndrome, schizophrenia,
stroke, sleep disorders, anxiety, immunostimulants, repressors, wound healing
and a role as a tissue repair agent or in anti-retroviral therapy.

Summary of the Invention 
A method of modifying a test NR polypeptide is disclosed. The method can
comprise: providing a test NR polypeptide sequence having a characteristic that
is targeted for modification; aligning the test NR polypeptide sequence with at
least one reference NR polypeptide sequence for which an X-ray structure is
available, wherein the at least one reference NR polypeptide sequence has a
characteristic that is desired for the test NR polypeptide; building a
three-dimensional model for the test NR polypeptide using the three-dimensional
coordinates of the X-ray structure(s) of the at least one reference polypeptide and
its sequence alignment with the test NR polypeptide sequence; examining the
three-dimensional model of the test NR polypeptide for differences with the at
least one reference polypeptide that are associated with the desired characteristic;
and mutating at least one amino acid residue in the test NR polypeptide sequence
located at a difference identified above to a residue associated with the desired
characteristic, whereby the test NR polypeptide is modified.

A method of altering the solubility of a test NR polypeptide is also disclosed
in accordance with the present invention. In a preferred embodiment, the method
comprises: (a) providing a reference NR polypeptide sequence and a test NR
polypeptide sequence; (b) comparing the reference NR polypeptide sequence and
the test NR polypeptide sequence to identify one or more residues in the test NR
sequence that are more or less hydrophilic than a corresponding residue in the
reference NR polypeptide sequence; and (c) mutating the residue in the test NR
polypeptide sequence identified in step (b) to a residue having a different
hydrophilicity, whereby the solubility of the test NR polypeptide is altered.
Optionally, the reference NR polypeptide sequence is an AR or a PR sequence,
and the test polypeptide sequence is a GR polypeptide sequence. Alternatively,
the reference polypeptide sequence is a crystalline GR LBD. The comparing of
step (b) is preferably by sequence alignment.
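
Step (b) above can be sketched in code. This is a minimal illustration, not the disclosed procedure: it assumes the two sequences are already aligned (gaps written as `-`) and scores hydrophobicity with the Kyte-Doolittle hydropathy scale, which is a choice made here for the example since the text does not name a scale. The function name and the `min_delta` threshold are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# The choice of this scale is an assumption of the sketch.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def more_hydrophobic_positions(test_aln, ref_aln, min_delta=2.0):
    """List (column, test residue, reference residue) for alignment columns
    where the test residue is more hydrophobic than the reference residue
    by at least min_delta hydropathy units. Columns are 0-based."""
    if len(test_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for col, (t, r) in enumerate(zip(test_aln, ref_aln)):
        if t == "-" or r == "-":
            continue  # no corresponding residue in one of the sequences
        if KD[t] - KD[r] >= min_delta:
            hits.append((col, t, r))
    return hits
```

On the toy alignment `more_hydrophobic_positions("AFLK", "ASLE")`, only the second column (F in the test sequence, S in the reference) is flagged as a candidate for step (c).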

An isolated GR polypeptide comprising a mutation in a ligand binding
domain, wherein the mutation alters the solubility of the ligand binding domain, is
also disclosed. An isolated GR polypeptide, or functional portion thereof, having
one or more mutations comprising a substitution of a hydrophobic amino acid
residue by a hydrophilic amino acid residue in a ligand binding domain is also
disclosed. Preferably, in each case, the mutation can be at a residue selected
from the group consisting of V552, W557, F602, L636, Y648, W712, L741, L535,
V538, C638, M691, V702, Y648, Y660, L685, M691, V702, W712, L733, Y764
and combinations thereof. More preferably, the mutation is selected from the
group consisting of V552K, W557S, F602S, F602D, F602E, L636E, Y648Q,
W712S, L741R, L535T, V538S, C638S, M691T, V702T, W712T and combinations
thereof. Antibodies against such polypeptides are also disclosed, as are methods
of detecting such polypeptides and methods of identifying substances that
modulate the biological activity of such polypeptides.
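
The mutation names above follow the usual convention of original residue, position, replacement residue (for example, F602S replaces the phenylalanine at position 602 with serine). The following is a minimal sketch of applying such a name to a sequence fragment. The function name, the `first_residue_number` parameter and the four-residue toy fragment are hypothetical and do not come from the GR sequence.

```python
import re

# One-letter original residue, position number, one-letter replacement.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(seq: str, mutation: str, first_residue_number: int = 1) -> str:
    """Apply a point mutation such as 'F602S' to a sequence fragment whose
    first residue carries the number first_residue_number."""
    m = MUTATION.match(mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation name: {mutation!r}")
    original, position, replacement = m.group(1), int(m.group(2)), m.group(3)
    index = position - first_residue_number
    if not 0 <= index < len(seq):
        raise ValueError("position lies outside the supplied fragment")
    if seq[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]
```

For a toy fragment numbered from 600, `apply_mutation("MAFK", "F602S", first_residue_number=600)` returns `"MASK"`. The check on the original residue guards against off-by-one numbering errors.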

An isolated nucleic acid molecule encoding a GR polypeptide comprising a
mutation in a ligand binding domain, wherein the mutation alters the solubility of
the ligand binding domain, or encoding a GR LBD polypeptide, or functional
portion thereof, having one or more mutations comprising a substitution of a
hydrophobic amino acid residue by a hydrophilic amino acid residue, is also
disclosed. A chimeric gene, comprising the nucleic acid molecule operably linked
to a heterologous promoter, a vector comprising the chimeric gene, and a host cell
comprising the chimeric gene are also disclosed. Methods for detecting such a
nucleic acid molecule are also disclosed.

A substantially pure GRα ligand binding domain polypeptide in crystalline
form is disclosed. Preferably, the crystalline form has lattice constants of a = b =
126.014 Å, c = 86.312 Å, α = 90°, β = 90°, γ = 120°. Preferably, the crystalline
form is a hexagonal crystalline form. More preferably, the crystalline form has a
space group of P6₁. Even more preferably, the GRα ligand binding domain
polypeptide has the F602S amino acid sequence shown in Example 2. Even
more preferably, the GRα ligand binding domain has a crystalline structure further
characterized by the coordinates corresponding to Table 4.
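
As a worked illustration of the lattice constants above, the unit cell volume follows from the general triclinic formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ), which for a hexagonal cell reduces to a²c·sin 120°. The sketch below is illustrative only; the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell (cubic ångströms for edges in Å), computed
    with the general triclinic formula from edges and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x))
                  for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Lattice constants quoted in the text: a = b = 126.014, c = 86.312,
# alpha = beta = 90 degrees, gamma = 120 degrees.
volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)
```

For these constants the volume comes to roughly 1.19 × 10⁶ Å³.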

Preferably, the GRα ligand binding domain polypeptide is in complex with a
ligand. Optionally, the crystalline form contains two GRα ligand binding domain
polypeptides in the asymmetric unit. Preferably, the crystalline form is such that
the three-dimensional structure of the crystallized GRα ligand binding domain
polypeptide can be determined to a resolution of about 2.8 Å or better. Even more
preferably, the crystalline form contains one or more atoms having a molecular
weight of 40 grams/mol or greater.

A method for determining the three-dimensional structure of a crystallized
GR ligand binding domain polypeptide to a resolution of about 2.8 Å or better, the
method comprising: (a) crystallizing a GR ligand binding domain polypeptide; and
(b) analyzing the GR ligand binding domain polypeptide to determine the
three-dimensional structure of the crystallized GR ligand binding domain polypeptide,
whereby the three-dimensional structure of a crystallized GR ligand binding
domain polypeptide is determined to a resolution of about 2.8 Å or better.
Preferably, the analyzing is by X-ray diffraction. More preferably, the
crystallization is accomplished by the hanging drop method, and wherein the GRα
ligand binding domain is mixed with a reservoir.

A method of generating a crystallized GR ligand binding domain
polypeptide, the method comprising: (a) incubating a solution comprising a GR
ligand binding domain with a reservoir; and (b) crystallizing the GR ligand binding
domain polypeptide using the hanging drop method, whereby a crystallized GR
ligand binding domain polypeptide is generated.

A method of designing a modulator of a nuclear receptor, the method
comprising: (a) designing a potential modulator of a nuclear receptor that will
make interactions with amino acids in the ligand binding site of the nuclear
receptor based upon the atomic structure coordinates of a GR ligand binding
domain polypeptide; (b) synthesizing the modulator; and (c) determining whether
the potential modulator modulates the activity of the nuclear receptor, whereby a
modulator of a nuclear receptor is designed.

A method of designing a modulator that selectively modulates the activity of
a GRα polypeptide, the method comprising: (a) obtaining a crystalline form of a
GRα ligand binding domain polypeptide; (b) determining the three-dimensional
structure of the crystalline form of the GRα ligand binding domain polypeptide;
and (c) synthesizing a modulator based on the three-dimensional structure of the
crystalline form of the GRα ligand binding domain polypeptide, whereby a
modulator that selectively modulates the activity of a GRα polypeptide is
designed. Preferably, the method further comprises contacting a GRα ligand
binding domain polypeptide with the potential modulator; and assaying the GRα
ligand binding domain polypeptide for binding of the potential modulator, for a
change in activity of the GRα ligand binding domain polypeptide, or both. More
preferably, the crystalline form is in orthorhombic form. Even more preferably, the
crystals are such that the three-dimensional structure of the crystallized GRα
ligand binding domain polypeptide can be determined to a resolution of about 2.8
Å or better.

A method of screening a plurality of compounds for a modulator of a GR
ligand binding domain polypeptide, the method comprising: (a) providing a library
of test samples; (b) contacting a GR ligand binding domain polypeptide with each
test sample; (c) detecting an interaction between a test sample and the GR ligand
binding domain polypeptide; (d) identifying a test sample that interacts with the
GR ligand binding domain polypeptide; and (e) isolating a test sample that
interacts with the GR ligand binding domain polypeptide, whereby a plurality of
compounds is screened for a modulator of a GR ligand binding domain
polypeptide. Preferably, the test samples are bound to a substrate, and more
preferably, the test samples are synthesized directly on a substrate. The GR
ligand binding domain polypeptide can be in soluble or crystalline form.

A method for identifying a GR modulator is also disclosed. In a preferred
embodiment, the method comprises: (a) providing atomic coordinates of a GR
ligand binding domain to a computerized modeling system; and (b) modeling
ligands that fit spatially into the binding pocket of the GR ligand binding domain to
thereby identify a GR modulator, whereby a GR modulator is identified.
Preferably, the method further comprises identifying in an assay for GR-mediated
activity a modeled ligand that increases or decreases the activity of the GR.
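
One elementary operation a computerized modeling system performs on atomic coordinates is finding the residues that line a binding pocket, i.e. those with an atom within some distance of a bound or modeled ligand. The following is a minimal sketch under stated assumptions: the `Atom` record, the residue labels, the 4.0 Å default cutoff and the toy coordinates are all hypothetical and are not taken from the disclosed structure.

```python
import math
from typing import Iterable, List, NamedTuple, Tuple


class Atom(NamedTuple):
    residue: str  # residue label, e.g. "RES1" (hypothetical format)
    x: float
    y: float
    z: float


def pocket_residues(
    protein_atoms: Iterable[Atom],
    ligand_atoms: Iterable[Tuple[float, float, float]],
    cutoff: float = 4.0,
) -> List[str]:
    """Return sorted labels of residues having at least one atom within
    `cutoff` ångströms of any ligand atom."""
    ligand = list(ligand_atoms)
    near = set()
    for atom in protein_atoms:
        position = (atom.x, atom.y, atom.z)
        if any(math.dist(position, lig) <= cutoff for lig in ligand):
            near.add(atom.residue)
    return sorted(near)
```

With toy input `[Atom("RES1", 0, 0, 0), Atom("RES2", 10, 0, 0)]` and a single ligand atom at `(1.0, 1.0, 1.0)`, only `RES1` is reported. A real modeling system would go on to score shape and chemical complementarity, which this sketch does not attempt.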

A method of identifying a modulator that selectively modulates the activity of
a GRα polypeptide compared to other GR polypeptides, the method comprising:
(a) providing atomic coordinates of a GRα ligand binding domain to a
computerized modeling system; and (b) modeling a ligand that fits into the binding
pocket of a GRα ligand binding domain and that interacts with conformationally
constrained residues of a GRα conserved among GR subtypes, whereby a
modulator that selectively modulates the activity of a GRα polypeptide compared
to other polypeptides is identified. Preferably, the method further comprises
identifying in a biological assay for GRα activity a modeled ligand that selectively
binds to GRα and increases or decreases the activity of said GRα.

A method of designing a modulator of a GR polypeptide, the method
comprising: (a) selecting a candidate GR ligand; (b) determining which amino acid
or amino acids of a GR polypeptide interact with the ligand using a
three-dimensional model of a crystallized protein comprising a GRα LBD; (c) identifying
in a biological assay for GR activity a degree to which the ligand modulates the
activity of the GR polypeptide; (d) selecting a chemical modification of the ligand
wherein the interaction between the amino acids of the GR polypeptide and the
ligand is predicted to be modulated by the chemical modification; (e) synthesizing
a chemical compound with the selected chemical modification to form a modified
ligand; (f) contacting the modified ligand with the GR polypeptide; (g) identifying in
a biological assay for GR activity a degree to which the modified ligand modulates
the biological activity of the GR polypeptide; and (h) comparing the biological
activity of the GR polypeptide in the presence of modified ligand with the biological
activity of the GR polypeptide in the presence of the unmodified ligand, whereby a
modulator of a GR polypeptide is designed. Preferably, the GR polypeptide is a
GRα polypeptide. More preferably, the three-dimensional model of a crystallized
protein is a GRα LBD polypeptide with a bound ligand. Optionally, the method
further comprises repeating steps (a) through (f), if the biological activity of the GR
polypeptide in the presence of the modified ligand varies from the biological
activity of the GR polypeptide in the presence of the unmodified ligand.
An assay method for identifying a compound that inhibits binding of a 
ligand to a GR polypeptide, the assay method comprising: (a) designing a test 
inhibitor compound based on the three dimensional atomic coordinates of GR; (b) 
incubating a GR polypeptide with a ligand in the presence of a test inhibitor 
compound; (c) determining an amount of ligand that is bound to the GR 
polypeptide, wherein decreased binding of ligand to the GR protein in the 
presence of the test inhibitor compound relative to binding of ligand in the 
absence of the test inhibitor compound is indicative of inhibition; and (d) 
identifying the test compound as an inhibitor of ligand binding if decreased ligand 
binding is observed, whereby a compound that inhibits binding of a ligand to a GR 
polypeptide is identified. 
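
The comparison in steps (c) and (d) of the assay method can be sketched numerically. The following is a minimal sketch, assuming that the amount of bound ligand is available as a single number for each condition; the function names and the example readings are hypothetical and are not taken from the disclosure.

```python
def percent_inhibition(bound_without: float, bound_with: float) -> float:
    """Percent decrease in bound ligand caused by the test compound.

    bound_without: ligand bound in the absence of the test inhibitor compound
    bound_with:    ligand bound in the presence of the test inhibitor compound
    """
    if bound_without <= 0:
        raise ValueError("reference binding must be positive")
    return 100.0 * (bound_without - bound_with) / bound_without


def is_inhibitor(bound_without: float, bound_with: float) -> bool:
    """Step (d): flag the compound if decreased ligand binding is observed."""
    return bound_with < bound_without


# Hypothetical readings in arbitrary units.
print(percent_inhibition(200.0, 50.0))  # 75.0
print(is_inhibitor(200.0, 50.0))        # True
```

A real assay would likely compare replicate measurements against assay noise before calling a decrease; the strict comparison above is only the simplest reading of step (d).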

A method of identifying a NR modulator that selectively modulates the 
biological activity of one NR compared to GRα is also disclosed. The method 
comprises: (a) providing an atomic structure coordinate set describing a GRα 
ligand binding domain structure and at least one other atomic structure coordinate 
set describing a NR ligand binding domain, each ligand binding domain 
comprising a ligand binding site; (b) comparing the atomic structure coordinate 
sets to identify at least one difference between the sets; (c) designing a candidate 
ligand predicted to interact with the difference of step (b); (d) synthesizing the 
candidate ligand; and (e) testing the synthesized candidate ligand for an ability to 
selectively modulate a NR as compared to GRα, whereby a NR modulator that 
selectively modulates the biological activity of one NR compared to GRα is identified. 

Accordingly, it is an object of the present invention to provide a three 
dimensional structure of the ligand binding domain of a GR. The object is 
achieved in whole or in part by the present invention. 

An object of the invention having been stated hereinabove, other objects 
will be evident as the description proceeds, when taken in connection with the 
accompanying Drawings and Laboratory Examples as best described 
hereinbelow. 
Brief Description of the Drawings 
Figure 1A depicts E. coli expression of mutant 6xHisGST-GR(521-777) 
F602S (SEQ ID NO:12) via SDS-PAGE. Staining was accomplished using the 
commercially available PROBLUE product. 
Figure 1B depicts E. coli expression of mutant 6xHisGST-GR(521-777) 
F602D (SEQ ID NO:14) via SDS-PAGE. Staining was accomplished using the 
commercially available PROBLUE product. 

Figure 1C depicts purification of E. coli expressed GR(521-777)F602S 
(SEQ ID NO:12) via SDS-PAGE. Staining was accomplished using the 
commercially available PROBLUE product. 

Figure 1D shows the partial purification of E. coli expressed GR(521-777) 
for several mutants isolated by the Lad Fusion system. 

Figures 2A-2C depict characterization of GR binding to dexamethasone 
and the TIF2 LXXLL (SEQ ID NO:18) motif. 

Figure 2A is a graph depicting the binding of 10 nM fluorescein 
dexamethasone to varied concentrations of GST-GR LBD (F602S) 521-777 
(circles), GR LBD (F602S) 521-777 (triangles) and GR LBD (F602S) 521-777 in 
the presence of 100 uM unlabeled dexamethasone (squares) as measured by 
fluorescence polarization. 

Figure 2B is a graph depicting ligand-dependent binding of TIF2 
LXXLL (SEQ ID NO:18) motif to GR LBD. The binding of varied concentrations of 
GST-GR LBD (F602S) 521-777 to immobilized TIF2 732-756 peptide (SEQ ID 
NO:17) in the presence of a five-fold excess of dexamethasone (triangles), RU486 
(squares) and no compound (circles) was measured by surface plasmon 
resonance. Each point is the average of two determinations. 

Figure 2C is a graph depicting that TIF2 coactivator peptide enhances 
stability of GR dexamethasone binding activity. The effect of 25 uM coactivator 
peptide TIF2 732-756 (diamonds) or no peptide (squares) on the binding of 
GST-GR LBD (F602S) 521-777 to 10 nM fluorescein dexamethasone with time is 
determined by fluorescence polarization. 

Figure 3A is a worm/ribbon diagram depicting the overall arrangement of 
the GR LBD dimers. Two GR LBDs are shown in white and gray worm 
representation. TIF2 peptides are shown in gray ribbon and the two 
dexamethasone ligands are shown in space filling. 

WO 03/015692 PCT/US02/22648 

Figure 3B is a worm/ribbon diagram depicting one orientation of the 
GR/TIF2/Dex complex. TIF2 peptide is shown in ribbon and GR is shown in worm. 
The AF2 helix of the GR is shown in gray worm. The key structural elements are 
marked and are described herein below. 

Figure 3C is a worm/ribbon diagram depicting a second orientation of the 
GR/TIF2/Dex complex. TIF2 peptide is shown in ribbon and GR is shown in 
worm. The AF2 helix of GR is shown in gray worm. The key structural elements 
are marked and are described herein below. 

Figures 4A and 4B depict the overlap of the GR LBD with the AR LBD 
(Figure 4A) and the PR LBD (Figure 4B). The GR is in thick line. AR and PR are 
in the thin line. Only the backbone C alpha atoms are shown. 

Figure 5 is a sequence alignment of steroid receptors, particularly an 
alignment of the F602S GRα sequence (SEQ ID NO:31) with MR (SEQ ID NO:26), 
PR (SEQ ID NO:27), AR (SEQ ID NO:28), ERα (SEQ ID NO:29), and ERβ (SEQ ID 
NO:30). Residues that lie within 5.0 angstroms of the ligand are identified with 
small square boxes around the one-letter amino acid code. The ligands used for 
this calculation are dexamethasone (for GR), progesterone (for PR), 
dihydrotestosterone (for AR), estradiol (for ERα) and genistein (for ERβ). The 
alpha-helices and beta-strands observed in the X-ray structures are identified by 
the larger boxes and captions. Note that the secondary structure of MR is not 
publicly known at this time, and thus is not annotated in the Figure. More than 
one structure is available for PR, AR, ERα and ERβ, and, in some cases, the 
alpha-helices have different endpoints in these different X-ray structures. The 
variation in the alpha-helices is indicated here by using boxes with thicker and 
thinner linewidths, where the thicker linewidth box encompasses residues that 
adopt the same secondary structure in all available X-ray structures, and thinner 
linewidth boxes encompass residues that adopt an alpha-helical structure in some 
but not all X-ray structures. The secondary structures were determined by 
graphical examination of the X-ray structures. 
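
The 5.0 angstrom criterion used to box residues in Figure 5 can be illustrated with a short distance calculation. This is a minimal sketch under stated assumptions: a residue is counted when any one of its atoms lies at or within the cutoff of any ligand atom (the text does not say which atoms were compared), and the coordinates below are toy values, not coordinates from the disclosed structure.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted residue numbers having an atom within `cutoff` angstroms of the ligand.

    protein_atoms: iterable of (residue_number, (x, y, z)) tuples
    ligand_atoms:  iterable of (x, y, z) tuples
    """
    ligand_atoms = list(ligand_atoms)
    near = set()
    for residue_number, xyz in protein_atoms:
        if residue_number in near:
            continue  # residue already selected by another of its atoms
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(residue_number)
    return sorted(near)


# Toy coordinates in angstroms (hypothetical).
protein = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (9.0, 0.0, 0.0)),
    (12, (0.0, 4.0, 3.0)),
]
ligand = [(0.0, 0.0, 3.0), (0.0, 0.0, 4.0)]

print(residues_near_ligand(protein, ligand))  # [10, 12]
```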

Figure 6A depicts the GR ligand binding pocket. The GR LBD is shown in 
a worm representation and the pocket is shown with a white surface. 




Figure 6B is a diagram that depicts surfaces at the GR-dexamethasone 
interface. The electron density is calculated with Fo coefficiency and shown at a 
one sigma cutoff. Key residues surrounding the ligand are also labeled, as 
described herein below. 

Figure 7 is a diagram of molecular interactions between GR and 
dexamethasone. Both Van der Waals contacts and hydrogen bonds are indicated 
with dotted lines. 

Figure 8 is a wire frame diagram showing the structure around the F602 
mutation in the GRα LBD polypeptide. The lipophilic F602 side-chain of the 
wild-type GRα protein would be located in a hydrophilic environment and could 
destabilize the protein. Changing the phenylalanine (F) to a serine (S) allows the 
S602 side-chain and NH group to make direct hydrogen bonds with two water 
molecules (1H2O and 2H2O). Other residues involved with the two water 
molecules are also shown and are described herein below. 



Brief Description of Sequences in the Sequence Listing 
SEQ ID NOs:1 and 2 are, respectively, a DNA sequence encoding a wild 
type full-length human glucocorticoid receptor (GenBank Accession No. 31679) 
and the amino acid sequence (GenBank Accession No. 121069) of a human 
glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:3 and 4 are, respectively, a DNA sequence encoding a F602S 
full-length human glucocorticoid receptor and the amino acid sequence of a 
human glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:5 and 6 are, respectively, a DNA sequence encoding a F602D 
full-length human glucocorticoid receptor and the amino acid sequence of a 
human glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:7 and 8 are, respectively, a DNA sequence encoding a 
preferred embodiment of a full-length human glucocorticoid receptor of the 
present invention and the amino acid sequence of a human glucocorticoid 
receptor encoded by the DNA sequence. These sequences thus include variable 
amino acids at the following locations: V552, W557, F602, L636, Y648, W712, 
L741, L535, V538, C638, M691, V702, Y648, Y660, L685, M691, V702, W712, 
L733, and Y764, thus reflecting the mutagenesis approach of the present 
invention disclosed herein below. Thus, a full length human glucocorticoid 
receptor of the present invention can include a mutation at any one of these 
residues, and/or at any combination of these residues. 
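
To give a sense of how many distinct sets of mutated positions "any combination of these residues" covers, the listed positions can be de-duplicated and counted. This is only an illustrative sketch: it counts which positions are mutated, not which replacement amino acid is chosen at each one, and the variable names are hypothetical.

```python
# Positions exactly as listed in the text (some appear twice).
positions_as_listed = [
    "V552", "W557", "F602", "L636", "Y648", "W712", "L741", "L535", "V538",
    "C638", "M691", "V702", "Y648", "Y660", "L685", "M691", "V702", "W712",
    "L733", "Y764",
]

# Remove repeats and order by residue number.
unique_positions = sorted(set(positions_as_listed), key=lambda p: int(p[1:]))
n = len(unique_positions)

# Every non-empty subset of positions is one possible combination.
nonempty_subsets = 2 ** n - 1

print(n)                 # 16
print(nonempty_subsets)  # 65535
```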

SEQ ID NOs:9 and 10 are, respectively, a DNA sequence encoding a wild 
type ligand binding domain of a human glucocorticoid receptor and the amino acid 
sequence of a human glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:11 and 12 are, respectively, a DNA sequence encoding a 
ligand binding domain (residues 521-777) of a human glucocorticoid receptor 
containing a phenylalanine to serine mutation at residue 602 and the amino acid 
sequence of a human glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:13 and 14 are, respectively, a DNA sequence encoding a 
ligand binding domain (residues 521-777) of a human glucocorticoid receptor 
containing a phenylalanine to aspartic acid mutation at residue 602 and the amino 
acid sequence of a human glucocorticoid receptor encoded by the DNA 
sequence. 

SEQ ID NOs:15 and 16 are, respectively, a DNA sequence encoding a 
preferred embodiment of a ligand binding domain of a human glucocorticoid 
receptor of the present invention and the amino acid sequence of a human 
glucocorticoid receptor encoded by the DNA sequence. These sequences thus 
include variable amino acids at the following locations: V552, W557, F602, L636, 
Y648, W712, L741, L535, V538, C638, M691, V702, Y648, Y660, L685, M691, 
V702, W712, L733, and Y764, thus reflecting the mutagenesis approach of the 
present invention disclosed herein below. Thus, a ligand binding domain of a 
human glucocorticoid receptor of the present invention can include a mutation at 
any one of these residues, and/or at any combination of these residues. 

SEQ ID NO:17 is an amino acid sequence of amino acid residues 732-756 
of the human TIF2 protein. 

SEQ ID NO:18 is an LXXLL motif of the human TIF2 protein. 

SEQ ID NOs:19-20 are oligonucleotide primers used to engineer a 
polyhistidine tag in frame to the sequence encoding glutathione S-transferase 
(GST). 

SEQ ID NO:21 is the resulting amino acid sequence of the modified GST. 

SEQ ID NOs:22-25 are oligonucleotide primers used in the mutagenesis 
approach of the present invention. 

SEQ ID NOs:26-31 are the ligand binding domain polypeptides of MR (SEQ 
ID NO:26), PR (SEQ ID NO:27), AR (SEQ ID NO:28), ERα (SEQ ID NO:29), 
ERβ (SEQ ID NO:30), and F602S GRα (SEQ ID NO:31), respectively. All of these 
sequences are also shown in Figure 5. Note that the GRα sequence of 
SEQ ID NO:31 starts at residue 527, whereas the F602S sequence of SEQ ID 
NO:12 starts at residue 521. 
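
Because the two fragments start at different residues, the same receptor residue number falls at a different position in each sequence string. The sketch below shows the bookkeeping, assuming simple consecutive numbering from the stated first residue; the helper names and the toy fragment are hypothetical and do not reproduce any sequence from the Sequence Listing.

```python
def sequence_index(residue_number: int, first_residue: int) -> int:
    """0-based index of a receptor residue number in a fragment starting at `first_residue`."""
    index = residue_number - first_residue
    if index < 0:
        raise ValueError("residue lies before the start of the fragment")
    return index


def apply_substitution(fragment: str, first_residue: int, mutation: str) -> str:
    """Apply a point substitution written like 'F602S' to a fragment string."""
    wild_type = mutation[0]
    residue_number = int(mutation[1:-1])
    replacement = mutation[-1]
    i = sequence_index(residue_number, first_residue)
    if i >= len(fragment):
        raise ValueError("residue lies beyond the end of the fragment")
    if fragment[i] != wild_type:
        raise ValueError(f"expected {wild_type} at residue {residue_number}, found {fragment[i]}")
    return fragment[:i] + replacement + fragment[i + 1:]


# Residue 602 in a fragment starting at 521 versus one starting at 527.
print(sequence_index(602, 521))  # 81
print(sequence_index(602, 527))  # 75

# Toy fragment (not a real GR sequence): 'F' placed at residue 602.
toy = "A" * 81 + "F" + "A" * 5
mutated = apply_substitution(toy, 521, "F602S")
print(mutated[81])  # S
```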

SEQ ID NO:32 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor containing a phenylalanine 
to serine mutation at residue 602, wherein the first two residues comprise a 
thrombin cleavage site encoded by vector. 

SEQ ID NO: 33 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising a W557R 
mutation. 

SEQ ID NO: 34 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising a Q615L 
mutation. 

SEQ ID NO: 35 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising a Q615H 
mutation. 

SEQ ID NO: 36 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising a A574T 
mutation. 

SEQ ID NO: 37 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising a L620M 
mutation. 

SEQ ID NO: 38 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising the double 
mutation F602L/A580T. 

SEQ ID NO: 39 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising the double 
mutation L563F/G583C. 

SEQ ID NO: 40 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising the double 
mutation L664H/M752T. 

SEQ ID NO: 41 is an amino acid sequence of a ligand binding domain 
(residues 521-777) of a human glucocorticoid receptor comprising the double 
mutation L563F/T744N. 

Detailed Description of the Invention 
The present invention provides for the generation of NR, SR and GR 
polypeptides and NR, SR or GR mutants (preferably GRα and GRα LBD 
mutants), and the ability to solve the crystal structures of those that crystallize. 
Indeed, a GRα LBD having a point mutation was crystallized and solved in one 
aspect of the present invention. Thus, an aspect of the present invention involves 
the use of both targeted and random mutagenesis of the GR gene for the 
production of a recombinant protein with improved solution characteristics for the 
purpose of crystallization, characterization of biologically relevant protein-protein 
interactions, and compound screening assays. The present invention, relating to 
GR LBD F602S and other LBD mutations, shows that GR can be overexpressed 
using an E. coli expression system and that active GR protein can be purified, 
assayed, and crystallized. 

Until disclosure of the present invention presented herein, the ability to 
obtain crystalline forms of the ligand binding domain of GRα has not been 
realized. And until disclosure of the present invention presented herein, a detailed 
three-dimensional crystal structure of a GRα LBD polypeptide has not been 
solved. 

In addition to providing structural information, crystalline polypeptides 
provide other advantages. For example, the crystallization process itself further 
purifies the polypeptide, and satisfies one of the classical criteria for homogeneity. 
In fact, crystallization frequently provides unparalleled purification quality, 
removing impurities that are not removed by other purification methods such as 
HPLC, dialysis, conventional column chromatography, and other methods. 
Moreover, crystalline polypeptides are sometimes stable at ambient temperatures 
and free of protease contamination and other degradation associated with solution 
storage. Crystalline polypeptides can also be useful as pharmaceutical 
preparations. Finally, crystallization techniques in general are largely free of 
problems such as denaturation associated with other stabilization methods (e.g., 
lyophilization). Once crystallization has been accomplished, crystallographic data 
provides useful structural information that can assist the design of compounds that 
can serve as modulators (e.g., agonists or antagonists), as described herein 
below. In addition, the crystal structure provides information useful to map a 
receptor binding domain, which can then be mimicked by a chemical entity that 
can serve as an antagonist or agonist. 

I. Definitions 

Following long-standing patent law convention, the terms "a" and "an" 
mean "one or more" when used in this application, including the claims. 

As used herein, the term "agonist" means an agent that supplements or 
potentiates the bioactivity of a functional GR gene or protein or of a polypeptide 
encoded by a gene that is up- or down-regulated by a GR polypeptide and/or a 
polypeptide encoded by a gene that contains a GR binding site or response 
element in its promoter region. By way of specific example, an "agonist" is a 
compound that interacts with the steroid hormone receptor to promote a 
transcriptional response. An agonist can induce changes in a receptor that place 
the receptor in an active conformation that allows it to influence transcription 
either positively or negatively. There can be several different ligand-induced 
changes in the receptor's conformation. The term "agonist" specifically 
encompasses partial agonists. 

As used herein, the terms "α-helix", "alpha-helix" and "alpha helix" are used 
interchangeably and mean the conformation of a polypeptide chain wherein the 
polypeptide backbone is wound around the long axis of the molecule in a 
left-handed or right-handed direction, and the R groups of the amino acids protrude 
outward from the helical backbone, wherein the repeating unit of the structure is a 
single turn of the helix, which extends about 0.56 nm along the long axis. 

As used herein, the term "antagonist" means an agent that decreases or 
inhibits the bioactivity of a functional GR gene or protein, or that supplements or 
potentiates the bioactivity of a naturally occurring or engineered non-functional GR 
gene or protein. Alternatively, an antagonist can decrease or inhibit the bioactivity 
of a functional gene or polypeptide encoded by a gene that is up- or 
down-regulated by a GR polypeptide and/or contains a GR binding site or response 
element in its promoter region. An antagonist can also supplement or potentiate 
the bioactivity of a naturally occurring or engineered non-functional gene or 
polypeptide encoded by a gene that is up- or down-regulated by a GR 
polypeptide, and/or contains a GR binding site or response element in its 
promoter region. By way of specific example, an "antagonist" is a compound that 
interacts with the steroid hormone receptor to inhibit a transcriptional response. 
An antagonist can bind to a receptor but fail to induce conformational changes 
that alter the receptor's transcriptional regulatory properties or physiologically 
relevant conformations. Binding of an antagonist can also block the binding and 
therefore the actions of an agonist. The term "antagonist" specifically 
encompasses partial antagonists. 

As used herein, the terms "β-sheet", "beta-sheet" and "beta sheet" are used 
interchangeably and mean the conformation of a polypeptide chain stretched into 
an extended zig-zag conformation. Portions of polypeptide chains that run 
"parallel" all run in the same direction. Polypeptide chains that are "antiparallel" 
run in the opposite direction from the parallel chains. 

As used herein, the terms "binding pocket of the GR ligand binding 
domain", "GR ligand binding pocket" and "GR binding pocket" are used 
interchangeably, and refer to the large cavity within the GR ligand binding domain 
where a ligand can bind. This cavity can be empty, or can contain water 
molecules or other molecules from the solvent, or can contain ligand atoms. The 
main binding pocket is the region of space encompassed by the residues depicted 
in Figure 7. The binding pocket also includes regions of space that are not 
occupied by atoms of GR, that are near the "main" binding pocket, and that are 
contiguous with the "main" binding pocket. 

As used herein, the term "biological activity" means any observable effect 
flowing from interaction between a GR polypeptide and a ligand. Representative, 
but non-limiting, examples of biological activity in the context of the present 
invention include transcription regulation, ligand binding and peptide binding. 

As used herein, the terms "candidate substance" and "candidate 
compound" are used interchangeably and refer to a substance that is believed to 
interact with another moiety, for example a given ligand that is believed to interact 
with a complete, or a fragment of, a GR polypeptide, and which can be 
subsequently evaluated for such an interaction. Representative candidate 
substances or compounds include xenobiotics such as drugs and other 
therapeutic agents, carcinogens and environmental pollutants, natural products 
and extracts, as well as endobiotics such as glucocorticosteroids, steroids, fatty 
acids and prostaglandins. Other examples of candidate compounds that can be 
investigated using the methods of the present invention include, but are not 
restricted to, agonists and antagonists of a GR polypeptide, toxins and venoms, 
viral epitopes, hormones (e.g., glucocorticosteroids, opioid peptides, steroids, 
etc.), hormone receptors, peptides, enzymes, enzyme substrates, co-factors, 
lectins, sugars, oligonucleotides or nucleic acids, oligosaccharides, proteins, small 
molecules and monoclonal antibodies. 

As used herein, the terms "cells," "host cells" or "recombinant host cells" 
are used interchangeably and refer not only to the particular subject cell, but also 
to the progeny or potential progeny of such a cell. Because certain modifications 
can occur in succeeding generations due to either mutation or environmental 
influences, such progeny might not, in fact, be identical to the parent cell, but are 
still included within the scope of the term as used herein. 

As used herein, the terms "chimeric protein" or "fusion protein" are used 
interchangeably and mean a fusion of a first amino acid sequence encoding a GR 
polypeptide with a second amino acid sequence defining a polypeptide domain 
foreign to, and not homologous with, any domain of a GR polypeptide. A chimeric 
protein can include a foreign domain that is found in an organism that also 
expresses the first protein, or it can be an "interspecies" or "intergenic" fusion of 
protein structures expressed by different kinds of organisms. In general, a fusion 
protein can be represented by the general formula X-GR-Y, wherein GR 
represents a portion of the protein which is derived from a GR polypeptide and X 
and Y are independently absent or represent amino acid sequences which are not 
related to a GR sequence in an organism, which includes naturally occurring 
mutants. 

As used herein, the term "co-activator" means an entity that has the ability 
to enhance transcription when it is bound to at least one other entity. The 
association of a co-activator with an entity has the ultimate effect of enhancing the 
transcription of one or more sequences of DNA. In the context of the present 
invention, transcription is preferably nuclear receptor-mediated. By way of 
specific example, in the present invention TIF2 (the human analog of mouse 
glucocorticoid receptor interaction protein 1 (GRIP1)) can bind to a site on the 
glucocorticoid receptor, an event that can enhance transcription. TIF2 is therefore a 
co-activator of the glucocorticoid receptor. Other GR co-activators can include 
SRC1. 

As used herein, the term "co-repressor" means an entity that has the ability 
to repress transcription when it is bound to at least one other entity. In the context 
of the present invention, transcription is preferably nuclear receptor-mediated. 
The association of a co-repressor with an entity has the ultimate effect of 
repressing the transcription of one or more sequences of DNA. 

As used herein, the term "crystal lattice" means the array of points defined 
by the vertices of packed unit cells. 

As used herein, the term "detecting" means confirming the presence of a 
target entity by observing the occurrence of a detectable signal, such as a 
radiologic or spectroscopic signal that will appear exclusively in the presence of 
the target entity. 

As used herein, the term "DNA segment" means a DNA molecule that has 
been isolated free of total genomic DNA of a particular species. In a preferred 
embodiment, a DNA segment encoding a GR polypeptide refers to a DNA 
segment that comprises any of the odd numbered SEQ ID NOs:1-16, but can 
optionally comprise fewer or additional nucleic acids, yet is isolated away from, or 
purified free from, total genomic DNA of a source species, such as Homo sapiens. 
Included within the term "DNA segment" are DNA segments and smaller 
fragments of such segments, and also recombinant vectors, including, for 
example, plasmids, cosmids, phages, viruses, and the like. 

As used herein, the term "DNA sequence encoding a GR polypeptide" can 
refer to one or more coding sequences within a particular individual. Moreover, 
certain differences in nucleotide sequences can exist between individual 
organisms, which are called alleles. It is possible that such allelic differences 
might or might not result in differences in amino acid sequence of the encoded 
polypeptide yet still encode a protein with the same biological activity. As is well 
known, genes for a particular polypeptide can exist in single or multiple copies 
within the genome of an individual. Such duplicate genes can be identical or can 
have certain modifications, including nucleotide substitutions, additions or 
deletions, all of which still code for polypeptides having substantially the same 
activity. 

As used herein, the phrase "enhancer-promoter" means a composite unit 
that contains both enhancer and promoter elements. An enhancer-promoter is 
operatively linked to a coding sequence that encodes at least one gene product. 

As used herein, the term "expression" generally refers to the cellular 
processes by which a biologically active polypeptide is produced. 

As used herein, the term "gene" is used for simplicity to refer to a functional 
protein, polypeptide or peptide encoding unit. As will be understood by those in 
the art, this functional term includes both genomic sequences and cDNA 
sequences. Preferred embodiments of genomic and cDNA sequences are 
disclosed herein. 

As used herein, the term "glucocorticoid" means a steroid hormone 
glucocorticoid. "Glucocorticoids" are agonists for the glucocorticoid receptor. 
Compounds which mimic glucocorticoids are also defined as glucocorticoid 
receptor agonists. A preferred glucocorticoid receptor agonist is dexamethasone. 
Other common glucocorticoid receptor agonists include cortisol, cortisone, 
prednisolone, prednisone, methylprednisolone, triamcinolone, hydrocortisone, and 
corticosterone. As used herein, glucocorticoid is intended to include, for example, 
the following generic and brand name corticosteroids: cortisone (CORTONE 
ACETATE, ADRESON, ALTESONA, CORTELAN, CORTISTAB, CORTISYL, 
CORTOGEN, CORTONE, SCHEROSON); dexamethasone-oral (DECADRON-ORAL, 
DEXAMETH, DEXONE, HEXADROL-ORAL, DEXAMETHASONE 
INTENSOL, DEXONE 0.5, DEXONE 0.75, DEXONE 1.5, DEXONE 4); 
hydrocortisone-oral (CORTEF, HYDROCORTONE); hydrocortisone cypionate 
(CORTEF ORAL SUSPENSION); methylprednisolone-oral (MEDROL-ORAL); 
prednisolone-oral (PRELONE, DELTA-CORTEF, PEDIAPRED, ADNISOLONE, 
CORTALONE, DELTACORTRIL, DELTASOLONE, DELTASTAB, DI-ADRESON 
F, ENCORTOLONE, HYDROCORTANCYL, MEDISOLONE, METICORTELONE, 
OPREDSONE, PANAFCORTELONE, PRECORTISYL, PRENISOLONA, 
SCHERISOLONA, SCHERISOLONE); prednisone (DELTASONE, LIQUID PRED, 
METICORTEN, ORASONE 1, ORASONE 5, ORASONE 10, ORASONE 20, 
ORASONE 50, PREDNICEN-M, PREDNISONE INTENSOL, STERAPRED, 
STERAPRED DS, ADASONE, CARTANCYL, COLISONE, CORDROL, CORTAN, 
DACORTIN, DECORTIN, DECORTISYL, DELCORTIN, DELLACORT, DELTA-DOME, 
DELTACORTENE, DELTISONA, DIADRESON, ECONOSONE, 
ENCORTON, FERNISONE, NISONA, NOVOPREDNISONE, PANAFCORT, 
PANASOL, PARACORT, PARMENISON, PEHACORT, PREDELTIN, 
PREDNICORT, PREDNICOT, PREDNIDIB, PREDNIMENT, RECTODELT, 
ULTRACORTEN, WINPRED); triamcinolone-oral (KENACORT, ARISTOCORT, 
ATOLONE, SHOLOG A, TRAMACORT-D, TRI-MED, TRIAMCOT, TRISTO-PLEX, 
TRYLONE D, U-TRI-LONE). 

As used herein, the term "glucocorticoid receptor," abbreviated herein as 
"GR," means the receptor for a steroid hormone glucocorticoid. A glucocorticoid 
receptor is a steroid receptor and, consequently, a nuclear receptor, since steroid 
receptors are a subfamily of the superfamily of nuclear receptors. The term "GR" 
means any polypeptide sequence that can be aligned with human GR such that at 
least 70%, preferably at least 75%, of the amino acids are identical to the 
corresponding amino acid in the human GR. The term "GR" also encompasses 
nucleic acid sequences where the corresponding translated protein sequence can 
be considered to be a GR. The term "GR" includes invertebrate homologs, 
whether now known or hereafter identified; preferably, GR nucleic acids and 
polypeptides are isolated from eukaryotic sources. "GR" further includes 
vertebrate homologs of GR family members, including, but not limited to, 
mammalian and avian homologs. Representative mammalian homologs of GR 
family members include, but are not limited to, murine and human homologs. 
"GR" specifically encompasses all GR isoforms, including GRα and GRβ. GRβ is 
a splicing variant with 100% identity to GRα, except at the C-terminus, where 50 
residues in GRα have been replaced with 15 residues in GRβ. 
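
The 70% identity criterion in the definition of "GR" can be checked on a pair of already aligned sequences. The sketch below is illustrative only: it assumes gaps are written as "-" and that identity is counted over the residues of the human GR reference, a choice of denominator that the text does not specify. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_reference: str, gap: str = "-") -> float:
    """Percent of reference residues matched by an identical residue in the aligned query."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    reference_positions = 0
    identical = 0
    for q, r in zip(aligned_query, aligned_reference):
        if r == gap:
            continue  # alignment column with no reference residue
        reference_positions += 1
        if q == r:
            identical += 1
    if reference_positions == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * identical / reference_positions


def meets_gr_threshold(aligned_query: str, aligned_reference: str,
                       threshold: float = 70.0) -> bool:
    """True if identity to the reference is at least `threshold` percent."""
    return percent_identity(aligned_query, aligned_reference) >= threshold


# Toy ten-residue alignment (not real GR sequence).
reference = "ACDEFGHIKL"
print(percent_identity("ACDEFGHAAA", reference))    # 70.0
print(meets_gr_threshold("ACDEFGHAAA", reference))  # True
print(meets_gr_threshold("ACDEFG-AAA", reference))  # False
```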
As used herein, the terms "GR gene product", "GR protein", "GR 
polypeptide", and "GR peptide" are used interchangeably and mean peptides 
having amino acid sequences which are substantially identical to native amino 
acid sequences from the organism of interest and which are biologically active in 
that they comprise all or a part of the amino acid sequence of a GR polypeptide, 
or cross-react with antibodies raised against a GR polypeptide, or retain all or 
some of the biological activity (e.g., DNA or ligand binding ability and/or 
transcriptional regulation) of the native amino acid sequence or protein. Such 
biological activity can include immunogenicity. Representative embodiments are 
set forth in any even numbered SEQ ID NOs:2-16. The terms "GR gene product", 
"GR protein", "GR polypeptide", and "GR peptide" also include analogs of a GR 
polypeptide. By "analog" is intended that a DNA or peptide sequence can contain 
alterations relative to the sequences disclosed herein, yet retain all or some of the 
biological activity of those sequences. Analogs can be derived from genomic 
nucleotide sequences as are disclosed herein or from other organisms, or can be 
created synthetically. Those skilled in the art will appreciate that other analogs, as 
yet undisclosed or undiscovered, can be used to design and/or construct GR 
analogs. There is no need for a "GR gene product", "GR protein", "GR 
polypeptide", or "GR peptide" to comprise all or substantially all of the amino acid 
sequence of a GR polypeptide gene product. Shorter or longer sequences are 
anticipated to be of use in the invention; shorter sequences are herein referred to 
as "segments". Thus, the terms "GR gene product", "GR protein", "GR 
polypeptide", and "GR peptide" also include fusion or recombinant GR 
polypeptides and proteins comprising sequences of the present invention. 
Methods of preparing such proteins are disclosed herein and are known in the art. 

As used herein, the terms "GR gene" and "recombinant GR gene" mean a 
nucleic acid molecule comprising an open reading frame encoding a GR 
polypeptide of the present invention, including both exon and (optionally) intron 
sequences. 

As used herein, "hexagonal unit cell" means a unit cell wherein a = b ≠ c,
and α = β = 90°, γ = 120°. The vectors a, b and c describe the unit cell edges and
the angles α, β, and γ describe the unit cell angles. In a preferred embodiment of
the present invention, the unit cell has lattice constants of a = b = 126.014 Å, c =
86.312 Å, α = 90°, β = 90°, γ = 120°. While preferred lattice constants are
provided, a crystalline polypeptide of the present invention also comprises
variations from the preferred lattice constants, wherein the variations range from
about one to about two percent. Thus, for example, a crystalline polypeptide of
the present invention can also comprise lattice constants of about 125 or about
127.
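
As a rough illustration of the one to two percent variation described above, the
following Python sketch checks whether a measured set of lattice constants falls
within a chosen fractional tolerance of the preferred hexagonal cell. The function
name, the 2% default, the 0.5 degree angle slack and the example measurements are
assumptions made for illustration only; they are not part of the disclosure.

```python
# Preferred hexagonal lattice constants from the text
# (edge lengths in angstroms, angles in degrees).
PREFERRED = {"a": 126.014, "b": 126.014, "c": 86.312,
             "alpha": 90.0, "beta": 90.0, "gamma": 120.0}


def within_tolerance(measured, preferred=PREFERRED, tol=0.02):
    """Return True if every cell edge is within `tol` (fractional) of the
    preferred value and the angles match the hexagonal setting.

    tol=0.02 mirrors the upper end of the "about one to about two percent"
    range; the 0.5 degree angle slack is an assumption, not a value from the text.
    """
    for edge in ("a", "b", "c"):
        if abs(measured[edge] - preferred[edge]) > tol * preferred[edge]:
            return False
    for angle in ("alpha", "beta", "gamma"):
        if abs(measured[angle] - preferred[angle]) > 0.5:
            return False
    return True


# Hypothetical measurements for illustration.
cell_ok = dict(PREFERRED, a=127.0, b=127.0)    # a and b about 0.8% longer
cell_bad = dict(PREFERRED, a=130.0, b=130.0)   # a and b about 3.2% longer
print(within_tolerance(cell_ok), within_tolerance(cell_bad))
```

Under these assumptions, the "about 125 or about 127" cells mentioned above pass
the check, while a cell with a = b = 130 does not.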

As used herein, the term "hybridization" means the binding of a probe 
molecule, a molecule to which a detectable moiety has been bound, to a target 
sample. 

As used herein, the term "interact" means detectable interactions between
molecules, such as can be detected using, for example, a yeast two hybrid assay.
The term "interact" is also meant to include "binding" interactions between
molecules. Interactions can, for example, be protein-protein or protein-nucleic
acid in nature.

As used herein, the term "intron" means a DNA sequence present in a
given gene that is not translated into protein.

As used herein, the term "isolated" means oligonucleotides substantially
free of other nucleic acids, proteins, lipids, carbohydrates or other materials with
which they can be associated, such association being either in cellular material or
in a synthesis medium. The term can also be applied to polypeptides, in which
case the polypeptide will be substantially free of nucleic acids, carbohydrates,
lipids and other undesired polypeptides.

As used herein, the term "labeled" means the attachment of a moiety,
capable of detection by spectroscopic, radiologic or other methods, to a probe
molecule.

As used herein, the term "modified" means an alteration from an entity's
normally occurring state. An entity can be modified by removing discrete chemical
units or by adding discrete chemical units. The term "modified" encompasses
detectable labels as well as those entities added as aids in purification.

As used herein, the term "modulate" means an increase, decrease, or other
alteration of any or all chemical and biological activities or properties of a wild-type
or mutant GR polypeptide, preferably a wild-type or mutant GR polypeptide. The
term "modulation" as used herein refers to both upregulation (i.e., activation or
stimulation) and downregulation (i.e., inhibition or suppression) of a response and
includes responses that are upregulated in one cell type or tissue, and
downregulated in another cell type or tissue.

As used herein, the term "molecular replacement" means a method that
involves generating a preliminary model of the wild-type GR ligand binding
domain, or a GR mutant crystal whose structure coordinates are unknown, by
orienting and positioning a molecule or model whose structure coordinates are
known (e.g., a nuclear receptor) within the unit cell of the unknown crystal so as
best to account for the observed diffraction pattern of the unknown crystal.
Phases can then be calculated from this model and combined with the observed
amplitudes to give an approximate Fourier synthesis of the structure whose
coordinates are unknown. This, in turn, can be subject to any of the several forms
of refinement to provide a final, accurate structure of the unknown crystal. See,
e.g., Lattman, (1985) Methods Enzymol. 115: 55-77; Rossmann, ed., (1972) The
Molecular Replacement Method, Gordon & Breach, New York. Using the structure
coordinates of the ligand binding domain of GR provided by this invention,
molecular replacement can be used to determine the structure coordinates of a
crystalline mutant or homologue of the GR ligand binding domain, or of a different
crystal form of the GR ligand binding domain.
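
The phase-combination step in this definition can be illustrated with a
one-dimensional toy calculation. The Python sketch below takes amplitudes
"observed" from an unknown density, takes phases calculated from an already
positioned search model, and combines the two in a Fourier synthesis. It is a
minimal sketch under stated assumptions: the densities are invented numbers, the
naive transform and all function names are hypothetical, and the rotation and
translation searches that orient and position the model in a real
three-dimensional problem are not shown.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(values)
    return [sum(values[x] * cmath.exp(-2j * math.pi * h * x / n) for x in range(n))
            for h in range(n)]


def inverse_dft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[h] * cmath.exp(2j * math.pi * h * x / n) for h in range(n)) / n
            for x in range(n)]


def combine(observed_amplitudes, model_density):
    """Fourier synthesis using observed amplitudes and model-derived phases."""
    model_f = dft(model_density)
    phased = [amp * cmath.exp(1j * cmath.phase(f))
              for amp, f in zip(observed_amplitudes, model_f)]
    return [value.real for value in inverse_dft(phased)]


# Toy one-dimensional "crystal": the true density is unknown to the experimenter,
# who measures only the amplitudes |F(h)|.
true_density = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0]
observed = [abs(f) for f in dft(true_density)]

# Positioned search model: resembles the true density but lacks the second peak.
model_density = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]

approx_map = combine(observed, model_density)
print([round(v, 2) for v in approx_map])
```

The resulting map carries the measured amplitudes but is biased toward the model
through its phases, which is why the definition describes it as approximate and
follows it with refinement.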

As used herein, the term "mutation" carries its traditional connotation and 
means a change, inherited, naturally occurring or introduced, in a nucleic acid or 
polypeptide sequence, and is used in its sense as generally known to those of skill 
in the art. 

As used herein, the term "nuclear receptor", occasionally abbreviated 
herein as "NR", means a member of the superfamily of receptors that comprises 
at least the subfamilies of steroid receptors, thyroid hormone receptors, retinoic
acid receptors and vitamin D receptors. Thus, a given nuclear receptor can be 
further classified as a member of a subfamily while retaining its status as a 
nuclear receptor. 

As used herein, the phrase "operatively linked" means that an
enhancer-promoter is connected to a coding sequence in such a way that the
transcription of that coding sequence is controlled and regulated by that
enhancer-promoter. Techniques for operatively linking an enhancer-promoter to a
coding sequence are well known in the art; the precise orientation and location
relative to a coding sequence of interest is dependent, inter alia, upon the specific
nature of the enhancer-promoter.

As used herein, the term "partial agonist" means an entity that can bind to a
receptor and induce only part of the changes in the receptors that are induced by
agonists. The differences can be qualitative or quantitative. Thus, a partial
agonist can induce some of the conformation changes induced by agonists, but
not others, or it can only induce certain changes to a limited extent.

As used herein, the term "partial antagonist" means an entity that can bind
to a receptor and inhibit only part of the changes in the receptors that are induced
by antagonists. The differences can be qualitative or quantitative. Thus, a partial
antagonist can inhibit some of the conformation changes induced by an
antagonist, but not others, or it can inhibit certain changes to a limited extent.

As used herein, the term "polypeptide" means any polymer comprising any
of the 20 protein amino acids, regardless of its size. Although "protein" is often
used in reference to relatively large polypeptides, and "peptide" is often used in
reference to small polypeptides, usage of these terms in the art overlaps and
varies. The term "polypeptide" as used herein refers to peptides, polypeptides
and proteins, unless otherwise noted. As used herein, the terms "protein",
"polypeptide" and "peptide" are used interchangeably when referring to a
gene product.

As used herein, the term "primer" means a sequence comprising two or
more deoxyribonucleotides or ribonucleotides, preferably more than three, and
more preferably more than eight and most preferably at least about 20 nucleotides
of an exonic or intronic region. Such oligonucleotides are preferably between ten
and thirty bases in length.

As used herein, the term "sequencing" means determining the ordered
linear sequence of nucleic acids or amino acids of a DNA or protein target sample,
using conventional manual or automated laboratory techniques.

As used herein, the term "space group" means the arrangement of
symmetry elements of a crystal.

As used herein, the term "steroid receptor" means a nuclear receptor that
can bind or associate with a steroid compound. Steroid receptors are a subfamily
of the superfamily of nuclear receptors. The subfamily of steroid receptors
comprises glucocorticoid receptors and, therefore, a glucocorticoid receptor is a
member of the subfamily of steroid receptors and the superfamily of nuclear
receptors.

As used herein, the terms "structure coordinates" and "structural 
coordinates" mean mathematical coordinates derived from mathematical 
equations related to the patterns obtained on diffraction of a monochromatic beam 
of X-rays by the atoms (scattering centers) of a molecule in crystal form. The 
diffraction data are used to calculate an electron density map of the repeating unit 
of the crystal. The electron density maps are used to establish the positions of the 
individual atoms within the unit cell of the crystal. 

Those of skill in the art understand that a set of coordinates determined by 
X-ray crystallography is not without standard error. In general, the error in the 
coordinates tends to be reduced as the resolution is increased, since more 
experimental diffraction data are available for the model fitting and refinement.
Thus, for example, more diffraction data can be collected from a crystal that 
diffracts to a resolution of 2.8 angstroms than from a crystal that diffracts to a 
lower resolution, such as 3.5 angstroms. Consequently, the refined structural 
coordinates will usually be more accurate when fitted and refined using data from 
a crystal that diffracts to higher resolution. The design of ligands and modulators 
for GR or any other NR depends on the accuracy of the structural coordinates. If 
the coordinates are not sufficiently accurate, then the design process will be 
ineffective. In most cases, it is very difficult or impossible to collect sufficient 
diffraction data to define atomic coordinates precisely when the crystals diffract to 
a resolution of only 3.5 angstroms or poorer. Thus, in most cases, it is difficult to 
use X-ray structures in structure-based ligand design when the X-ray structures 
are based on crystals that diffract to a resolution of only 3.5 angstroms or 
poorer. However, common experience has shown that crystals diffracting to 2.8 
angstroms or better can yield X-ray structures with sufficient accuracy to greatly 
facilitate structure-based drug design. Further improvement in the resolution can 
further facilitate structure-based design, but the coordinates obtained at 2.8 
angstroms resolution are generally adequate for most purposes. 
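
The link between resolution and the amount of available data can be made
concrete with a standard geometric estimate: the number of reciprocal-lattice
points inside the limiting sphere of radius 1/d grows as the cell volume divided
by d cubed. The Python sketch below applies that estimate to the 2.8 and 3.5
angstrom limits discussed above. The function names are hypothetical, and the
estimate ignores symmetry-equivalent reflections, data completeness and
measurement quality, so it is only a rough guide.

```python
import math


def reflections_in_sphere(cell_volume, d_min):
    """Approximate number of reciprocal-lattice points out to resolution d_min.

    Uses N ~ (4*pi/3) * V / d_min**3, i.e. the volume of the limiting sphere of
    radius 1/d_min divided by the reciprocal-cell volume 1/V. Symmetry-equivalent
    and Friedel-related reflections are not merged here.
    """
    return 4.0 * math.pi * cell_volume / (3.0 * d_min ** 3)


def data_ratio(d_high, d_low):
    """Ratio of reflection counts at two resolution limits (cell volume cancels)."""
    return (d_low / d_high) ** 3


# A crystal diffracting to 2.8 angstroms versus one diffracting to 3.5 angstroms.
print(round(data_ratio(2.8, 3.5), 2))
```

On this estimate a 2.8 angstrom data set contains roughly twice as many
reflections as a 3.5 angstrom data set from the same cell.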

Also, those of skill in the art will understand that NR proteins can adopt
different conformations when different ligands are bound. In particular, NR
proteins will adopt substantially different conformations when agonists and
antagonists are bound. Subtle variations in the conformation can also occur when
different agonists are bound, and when different antagonists are bound. These
variations can be difficult or impossible to predict from a single X-ray structure.
Generally, structure-based design of GR modulators depends to some degree on
a knowledge of the differences in conformation that occur when agonists and
antagonists are bound. Thus, structure-based modulator design is most facilitated
by the availability of X-ray structures of complexes with potent agonists as well as
potent antagonists.

As used herein, the term "substantially pure" means that the polynucleotide
or polypeptide is substantially free of the sequences and molecules with which it is
associated in its natural state, and those molecules used in the isolation
procedure. The term "substantially free" means that the sample is at least 50%,
preferably at least 70%, more preferably 80% and most preferably 90% free of the
materials and compounds with which it is associated in nature.

As used herein, the term "target cell" refers to a cell into which it is desired
to insert a nucleic acid sequence or polypeptide, or to otherwise effect a
modification from conditions known to be standard in the unmodified cell. A
nucleic acid sequence introduced into a target cell can be of variable length.
Additionally, a nucleic acid sequence can enter a target cell as a component of a
plasmid or other vector or as a naked sequence.

As used herein, the term "transcription" means a cellular process involving
the interaction of an RNA polymerase with a gene that directs the expression as
RNA of the structural information present in the coding sequences of the gene.
The process includes, but is not limited to, the following steps: (a) transcription
initiation, (b) transcript elongation, (c) transcript splicing, (d) transcript capping, (e)
transcript termination, (f) transcript polyadenylation, (g) nuclear export of the
transcript, (h) transcript editing, and (i) stabilizing the transcript.

As used herein, the term "transcription factor" means a cytoplasmic or
nuclear protein which binds to such gene, or binds to an RNA transcript of such
gene, or binds to another protein which binds to such gene or such RNA transcript
or another protein which in turn binds to such gene or such RNA transcript, so as
to thereby modulate expression of the gene. Such modulation can additionally be
achieved by other mechanisms; the essence of "transcription factor for a gene" is
that the level of transcription of the gene is altered in some way.

As used herein, the term "unit cell" means a basic parallelepiped-shaped
block. The entire volume of a crystal can be constructed by regular assembly of
such blocks. Each unit cell comprises a complete representation of the unit of
pattern, the repetition of which builds up the crystal. Thus, the term "unit cell"
means the fundamental portion of a crystal structure that is repeated infinitely by
translation in three dimensions. A unit cell is characterized by three vectors a, b
and c, not located in one plane, which form the edges of a parallelepiped. Angles
α, β and γ define the angles between the vectors: angle α is the angle between
vectors b and c; angle β is the angle between vectors a and c; and angle γ is the
angle between vectors a and b. The entire volume of a crystal can be constructed
by regular assembly of unit cells; each unit cell comprises a complete
representation of the unit of pattern, the repetition of which builds up the crystal.
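
Because a unit cell is a parallelepiped defined by three edge lengths and three
angles, its volume follows from the standard triclinic formula
V = abc * sqrt(1 - cos²α - cos²β - cos²γ + 2 cosα cosβ cosγ). The short Python
sketch below evaluates that formula for the preferred hexagonal lattice constants
given in the definition of "hexagonal unit cell". The function name is
hypothetical and the snippet is offered only as a worked illustration.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general (triclinic) unit cell.

    Edges a, b, c are in angstroms; angles alpha, beta, gamma are in degrees.
    The result is in cubic angstroms.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                 + 2.0 * ca * cb * cg)


# Preferred hexagonal cell from the text: a = b = 126.014, c = 86.312,
# alpha = beta = 90 degrees, gamma = 120 degrees.
hexagonal_volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)
print(round(hexagonal_volume))
```

For a hexagonal cell the general expression reduces to a²c·sin(120°), which
gives a quick consistency check on the result.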

II. Description of Tables

Table 1 is a chart of sequence identity between the ligand binding domains of
several nuclear receptors.

Table 2 is a table listing mutations of the GR LBD (521-777) gene for
testing solution solubility and stability. SEQ ID NOs:7-8 and 15-16 also comprise
these mutations. Candidate mutated residues include but are not limited to Cys,
Asn, Tyr, Lys, Ser, Asp, Glu, Gln, Arg or Thr.

Table 2A is a table listing mutations that were discovered using the
LacI-based "peptides-on-plasmids" technique with GR LBD.

Table 3 is a table summarizing the crystal and data statistics obtained from
the crystallized ligand binding domain of GRα that was co-crystallized with
dexamethasone and a fragment of the co-activator TIF2. Data on the unit cell are
presented, including data on the crystal space group, unit cell dimensions,
molecules per asymmetric cell and crystal resolution.

Table 4 is a table of the atomic structure coordinate data obtained from
X-ray diffraction from the ligand binding domain of GR (residues 521-777) in
complex with dexamethasone and a fragment of the co-activator TIF2.

Table 5 is a table of the atomic structure coordinates used as the initial
model to solve the structure of the GR/TIF2/dexamethasone complex by
molecular replacement. The GR model is a homology model built on the
published structure of the progesterone receptor LBD and the SRC1 coactivator
peptide from the PPARα/Compound 1/SRC1 structure.

III. General Considerations 

The present invention will usually be applicable mutatis mutandis to nuclear
receptors in general, more particularly to steroid receptors and even more
particularly to glucocorticoid receptors, including GR isoforms, as discussed
herein, based, in part, on the patterns of nuclear receptor and steroid receptor
structure and modulation that have emerged as a consequence of the present
disclosure, which in part discloses determining the three dimensional structure of
the ligand binding domain of GRα in complex with dexamethasone and a
fragment of the co-activator TIF2.

The nuclear receptor superfamily has been subdivided into two subfamilies:
the GR subfamily (also referred to as the steroid receptors and denoted SRs),
comprising GR, AR (androgen receptor), MR (mineralocorticoid receptor) and PR
(progesterone receptor), and the thyroid hormone receptor (TR) subfamily,
comprising TR, vitamin D receptor (VDR), retinoic acid receptor (RAR), retinoid X
receptor (RXR), and most orphan receptors. This division has been made on the
basis of DNA binding domain structures, interactions with heat shock proteins
(HSP), and ability to form dimers.

Steroid receptors (SRs) form a subset of the superfamily of nuclear
receptors. The glucocorticoid receptor is a steroid receptor and thus a member of
the superfamily of nuclear receptors and the subset of steroid receptors. The
human glucocorticoid receptor exists in two isoforms, GRα, which consists of 777
amino acids, and GRβ, which consists of 742 amino acids. As noted, the alpha
isoform of human glucocorticoid receptor is made up of 777 amino acids and is
predominantly cytoplasmic in its unactivated, non-DNA binding form. When
activated, it translocates to the nucleus. In order to understand the role played by
the glucocorticoid receptor in the different cell processes, the receptor was
mapped by transfecting receptor-negative and glucocorticoid-resistant cells with
different steroid receptor constructs and reporter genes like chloramphenicol
acetyltransferase (CAT) or luciferase which had been covalently linked to a
glucocorticoid responsive element (GRE). From these and other studies four
major functional domains have become evident.

From amino to carboxyl terminal end, these functional domains include the
tau 1, DNA binding, and ligand binding domains in succession. The tau 1 domain
spans amino acid positions 77-262 and regulates gene activation. The DNA
binding domain is from amino acid positions 421-486 and has nine cysteine
residues, eight of which are organized in the form of two zinc fingers analogous to
Xenopus transcription factor IIIA. The DNA binding domain binds to the regulatory
sequences of genes that are induced or deinduced by glucocorticoids. Amino
acids 521 to 777 form the ligand binding domain, which binds glucocorticoid to
activate the receptor. This region of the receptor also has the nuclear localization
signal. Deletion of this carboxyl terminal end results in a receptor that is
constitutively active for gene induction (up to 30% of wild type activity) and even
more active for cell kill (up to 150% of wild type activity) (Giguere et al., (1986)
Cell 46: 645-652; Hollenberg et al., (1987) Cell 49: 39-46; Hollenberg & Evans,
(1988) Cell 55: 899-906; Hollenberg et al., (1989) Cancer Res. 49: 2292s-2294s;
Oro et al., (1988) Cell 55: 1109-1114; Evans, (1989) in Recent Progress in
Hormone Research (Clark, ed.) Vol. 45, pp. 1-27, Academic Press, San Diego,
California; Green & Chambon, (1987) Nature 325: 75-78; Picard & Yamamoto,
(1987) EMBO J. 6: 3333-3340; Picard et al., (1990) Cell Regul. 1: 291-299;
Godowski et al., (1987) Nature 325: 365-368; Miesfeld et al., (1987) Science
236: 423-427; Danielsen et al., (1989) Cancer Res. 49: 2286s-2291s; Danielsen et
al., (1987) Molec. Endocrinol. 1: 816-822; Umesono & Evans, (1989) Cell 57:
1139-1146). Despite the aforementioned indirect characterization of the structure
of GRα, until the present disclosure, a detailed three-dimensional model of the
ligand binding domain of GRα has not been achieved.
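
The domain boundaries stated above lend themselves to a simple lookup. The
Python sketch below records the three residue ranges given in the text for the
777-residue alpha isoform and reports which domain, if any, contains a given
position. The dictionary and function names are hypothetical; positions that fall
between the listed ranges are treated as unassigned because the text does not
assign them to a domain.

```python
# Residue ranges for human GR alpha as stated in the text (1-based, inclusive).
GR_ALPHA_DOMAINS = {
    "tau 1": (77, 262),
    "DNA binding domain": (421, 486),
    "ligand binding domain": (521, 777),
}
GR_ALPHA_LENGTH = 777


def domain_of(residue):
    """Return the name of the domain containing `residue`, or None if unassigned."""
    if not 1 <= residue <= GR_ALPHA_LENGTH:
        raise ValueError("residue outside GR alpha (1-777)")
    for name, (start, end) in GR_ALPHA_DOMAINS.items():
        if start <= residue <= end:
            return name
    return None


print(domain_of(602))   # position of the F602S and F602D substitutions
print(domain_of(450))   # a position inside the zinc finger region
print(domain_of(300))   # between tau 1 and the DNA binding domain
```

A lookup of this kind is convenient when deciding whether a proposed mutation
falls within the ligand binding domain construct (residues 521-777).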

GR subgroup members are tightly bound by heat shock protein(s) (HSP) in
the absence of ligand, dimerize following ligand binding and dissociation of HSP,
and show homology in the DNA half sites to which they bind. These half sites
also tend to be arranged as palindromes. TR subgroup members tend to be
bound to DNA or other chromatin molecules when unliganded, can bind to DNA
as monomers and dimers, but tend to form heterodimers, and bind DNA elements
with a variety of orientations and spacings of the half sites, and also show
homology with respect to the nucleotide sequences of the half sites. ER does not
belong to either subfamily, since it resembles the GR subfamily in HSP
interactions, and the TR subfamily in nuclear localization and DNA-binding
properties.

Most members of the superfamily, including orphan receptors, possess at
least two transcription activation subdomains, one of which is constitutive and
resides in the amino terminal domain (AF-1), and the other of which (AF-2)
resides in the ligand binding domain, whose activity is regulated by binding of an
agonist ligand. The function of AF-2 requires an activation domain (also called
transactivation domain) that is highly conserved among the receptor superfamily.
Most LBDs contain an activation domain. Some mutations in this domain abolish
AF-2 function, but leave ligand binding and other functions unaffected. Ligand
binding allows the activation domain to serve as an interaction site for essential
co-activator proteins that function to stimulate (or in some cases, inhibit)
transcription.

Analysis and alignment of amino acid sequences, and X-ray and NMR
structure determinations, have shown that nuclear receptors have a modular
architecture with three main domains:

1) a variable amino-terminal domain; 

2) a highly conserved DNA-binding domain (DBD); and 

3) a less conserved carboxy-terminal ligand binding domain (LBD). 

In addition, nuclear receptors can have linker segments of variable length
between these major domains. Sequence analysis and X-ray crystallography,
including the disclosure of the present invention, have confirmed that GR also has
the same general modular architecture, with the same three domains. The
function of GR in human cells presumably requires all three domains in a single
amino acid sequence. However, the modularity of GR permits different domains
of each protein to separately accomplish certain functions. Some of the functions
of a domain within the full-length receptor are preserved when that particular
domain is isolated from the remainder of the protein. Using conventional protein
chemistry techniques, a modular domain can sometimes be separated from the
parent protein. Using conventional molecular biology techniques, each domain
can usually be separately expressed with its original function intact or, as
discussed herein below, chimeras comprising two different proteins can be
constructed, wherein the chimeras retain the properties of the individual functional
domains of the respective nuclear receptors from which the chimeras were
generated.

The carboxy-terminal activation subdomain is in close three dimensional
proximity in the LBD to the ligand, so as to allow for ligands bound to the LBD to
coordinate (or interact) with amino acid(s) in the activation subdomain. As
described herein, the LBD of a nuclear receptor can be expressed, crystallized, its
three dimensional structure determined with a ligand bound (either using crystal
data from the same receptor or a different receptor or a combination thereof), and
computational methods used to design ligands to its LBD, particularly ligands that
contain an extension moiety that coordinates the activation domain of the nuclear
receptor.

The LBD is the second most highly conserved domain in these receptors.
As its name suggests, the LBD binds ligands. With many nuclear receptors,
including GR, binding of the ligand can induce a conformational change in the
LBD that can, in turn, activate transcription of certain target genes. Whereas
integrity of several different LBD sub-domains is important for ligand binding,
truncated molecules containing only the LBD retain normal ligand-binding activity.
This domain also participates in other functions, including dimerization, nuclear
translocation and transcriptional activation, as described herein.

Nuclear receptors usually have HSP binding domains that present a region
for binding to the LBD and can be modulated by the binding of a ligand to the
LBD. For many of the nuclear receptors, ligand binding induces a dissociation of
heat shock proteins such that the receptors can form dimers in most cases, after
which the receptors bind to DNA and regulate transcription. Consequently, a
ligand that stabilizes the binding or contact of the heat shock protein binding
domain with the LBD can be designed using the computational methods described
herein.

With the receptors that are associated with the HSP in the absence of the
ligand, dissociation of the HSP results in dimerization of the receptors.
Dimerization is due to receptor domains in both the DBD and the LBD. Although
the main stimulus for dimerization is dissociation of the HSP, the ligand-induced
conformational changes in the receptors can have an additional facilitative
influence. With the receptors that are not associated with HSP in the absence of
the ligand, particularly with the TR, ligand binding can affect the pattern of
dimerization. The influence depends on the DNA binding site context, and can
also depend on the promoter context with respect to other proteins that can
interact with the receptors. A common pattern is to discourage monomer
formation, with a resulting preference for heterodimer formation over dimer
formation on DNA.

Nuclear receptor LBDs usually have dimerization domains that present a
region for binding to another nuclear receptor and can be modulated by the
binding of a ligand to the LBD. Consequently, a ligand that disrupts the binding or
contact of the dimerization domain can be designed using the computational
methods described herein to produce a partial agonist or antagonist.

The amino terminal domain of GR is the least conserved of the three
domains. This domain is involved in transcriptional activation, and its uniqueness
might dictate selective receptor-DNA binding and activation of target genes by GR
subtypes. This domain can display synergistic and antagonistic interactions with
the domains of the LBD.

The DNA binding domain has the most highly conserved amino acid
sequence amongst the GRs. It typically comprises about 70 amino acids that fold
into two zinc finger motifs, wherein a zinc atom coordinates four cysteines. The
DBD comprises two perpendicularly oriented α-helices that extend from the base
of the first and second zinc fingers. The two zinc fingers function in concert along
with non-zinc finger residues to direct the GR to specific target sites on DNA and
to align receptor dimer interfaces. Various amino acids in the DBD influence
spacing between two half-sites (which usually comprises six nucleotides) for
receptor dimerization. The optimal spacings facilitate cooperative interactions
between DBDs, and D box residues are part of the dimerization interface. Other
regions of the DBD that facilitate DNA-protein and protein-protein interactions are
involved in dimerization.

In nuclear receptors that bind to a HSP, the ligand-induced dissociation of
HSP with consequent dimer formation allows, and therefore promotes, DNA
binding. With receptors that are not associated (as in the absence of ligand),
ligand binding tends to stimulate DNA binding of heterodimers and dimers, and to
discourage monomer binding to DNA. However, with DNA containing only a
single half site, the ligand tends to stimulate the receptor's binding to DNA. The
effects are modest and depend on the nature of the DNA site and probably on the
presence of other proteins that can interact with the receptors. Nuclear receptors
usually have DBDs (DNA binding domains) that present a region for binding to
DNA, and this binding can be modulated by the binding of a ligand to the LBD.

The modularity of the members of the nuclear receptor superfamily permits
different domains of each protein to separately accomplish different functions,
although the domains can influence each other. The separate function of a
domain is usually preserved when a particular domain is isolated from the
remainder of the protein. Using conventional protein chemistry techniques, a
modular domain can sometimes be separated from the parent protein. By
employing conventional molecular biology techniques, each domain can usually be
separately expressed with its original function intact, or chimeras of two different
nuclear receptors can be constructed, wherein the chimeras retain the properties
of the individual functional domains of the respective nuclear receptors from which
the chimeras were generated.

Various structures have indicated that most nuclear receptor LBDs adopt
the same general folding pattern. This fold consists of 10-12 alpha helices
arranged in a bundle, together with several beta-strands, and linking segments. A
preferred GRα LBD structure of the present invention has 10-11 helices,
depending on whether helix-3' is counted. Structural studies have shown that
most of the alpha-helices and beta-strands have the same general position and
orientation in all nuclear receptor structures, whether ligand is bound or not.
However, the AF2 helix has been found in different positions and orientations
relative to the main bundle, depending on the presence or absence of the ligand,
and also on the chemical nature of the ligand. These structural studies have
suggested that many nuclear receptors share a common mechanism of activation,
where binding of activating ligands helps to stabilize the AF2 helix in a position
and orientation adjacent to helices-3, -4, and -10, covering an opening to the
ligand binding site. This position and orientation of the AF2 helix, which will be
called the "active conformation", creates a binding site for co-activators. See, e.g.,
Nolte et al., (1998) Nature 395: 137-43; Shiau et al., (1998) Cell 95: 927-37. This
co-activator binding site has a central lipophilic pocket that can accommodate
leucine side-chains from co-activators, as well as a "charge-clamp" structure
consisting essentially of a lysine residue from helix-3 and a glutamic acid residue
from the AF2 helix.

Structural studies have shown that co-activator peptides containing the
sequence LXXLL (where L is leucine and X can be a different amino acid in
different cases) can bind to this co-activator binding site by making interactions
with the charge clamp lysine and glutamic acid residues, as well as the central
lipophilic region. This co-activator binding site is disrupted when the AF2 helix is
shifted into other positions and orientations. In PPARγ, activating ligands such as
rosiglitazone (BRL49653) make a hydrogen bonding interaction with tyrosine-473
in the AF2 helix. Nolte et al., (1998) Nature 395: 137-43; Gampe et al., (2000)
Mol. Cell 5: 545-55. Similarly, in GR, the dexamethasone ligand makes a van der
Waals interaction with the side chain of leucine-753 from the AF2 helix. This
interaction is believed in part to stabilize the AF2 helix in the active conformation,
thereby allowing co-activators to bind and thus activating transcription from target
genes.
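
The LXXLL motif described above is easy to locate in a peptide sequence with a
regular expression. The Python sketch below scans a one-letter amino acid string
for every occurrence of the pattern. The function name is hypothetical, and the
example peptide is an illustrative string only; it is not a sequence taken from
this disclosure.

```python
import re

# A lookahead is used so that overlapping motifs are all reported.
LXXLL = re.compile(r"(?=(L..LL))")


def find_lxxll(sequence):
    """Return (1-based start position, motif) for each LXXLL occurrence."""
    return [(m.start() + 1, m.group(1)) for m in LXXLL.finditer(sequence.upper())]


# Illustrative peptide string only.
peptide = "KENALLRYLLDKDD"
print(find_lxxll(peptide))
```

Finding the motif in a sequence indicates only a candidate interaction site;
whether a given peptide actually binds depends on the structural context
described above.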

With certain antagonist ligands, or in the absence of any ligand, the AF2 
helix can be held less tightly in the active conformation, or can be free to adopt 
other conformations. This would either destabilize or disrupt the co-activator 
binding site, thereby reducing or eliminating co-activator binding and transcription 
from certain target genes. Some of the functions of the GR protein depend on 
having the full-length amino acid sequence and certain partner molecules, such as 
co-activators and DNA. However, other functions, including ligand binding and 
ligand-dependent conformational changes, can be observed experimentally using 
isolated domains, chimeras and mutant molecules. 

As described herein, the LBD of a GR can be mutated or engineered,
expressed and crystallized, and its three-dimensional structure can be determined
with a ligand bound, as disclosed in the present invention. Computational methods
can then be used to design ligands to nuclear receptors, preferably to steroid
receptors, and more preferably to glucocorticoid receptors.

IV. The Dexamethasone Ligand

Ligand binding can induce transcriptional activation functions in a variety of
ways. One way is through the dissociation of the HSP from receptors. This
dissociation, with consequent dimerization of the receptors and their binding to
DNA or other proteins in the nuclear chromatin, allows transcriptional regulatory
properties of the receptors to be manifest. This can be especially true of such
functions on the amino terminus of the receptors.

Another way is to alter the receptor to interact with other proteins involved 
in transcription. These could be proteins that interact directly or indirectly with 
elements of the proximal promoter or proteins of the proximal promoter. 
Alternatively, the interactions can be through other transcription factors that
themselves interact directly or indirectly with proteins of the proximal promoter.
Several different proteins have been described that bind to the receptors in a 
ligand-dependent manner. In addition, it is possible that in some cases, the 
ligand-induced conformational changes do not affect the binding of other proteins 
to the receptor, but do affect their abilities to regulate transcription. 

In one aspect of the present invention, a GR LBD was co-crystallized with a
fragment of the co-activator TIF2 and the ligand dexamethasone.
Dexamethasone is a synthetic adrenocortical steroid with a molecular weight of
392.47. The IUPAC name for dexamethasone is
(11β,16α)-9-fluoro-11β,17,21-trihydroxy-16α-methylpregna-1,4-diene-3,20-dione.
The empirical formula for dexamethasone is C22H29FO5. Dexamethasone is
represented by the chemical structure:

[Chemical structure of dexamethasone]
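
As a quick consistency check on the values quoted above, the molecular weight can be recomputed from the empirical formula C22H29FO5. The following is a minimal sketch only; the function name `molecular_weight` and the rounded atomic masses are assumptions introduced for illustration and are not part of the original disclosure.

```python
import re

# Standard atomic masses in g/mol, rounded to three decimals (assumed values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "F": 18.998, "O": 15.999}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a simple, unbracketed formula such as 'C22H29FO5'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


dexamethasone_mw = molecular_weight("C22H29FO5")
print(f"{dexamethasone_mw:.2f}")  # prints 392.47
```

The result agrees with the molecular weight of 392.47 given for dexamethasone.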




Dexamethasone-based therapeutics are commercially available in a variety
of forms and formulations. Dexamethasone, as well as starting materials for the
synthesis of dexamethasone, can also be purchased from various suppliers such
as Sigma (St. Louis, Missouri). The synthesis of dexamethasone, and
dexamethasone derivatives, is known and described in a variety of sources,
including Arth et al., (1958) J. Am. Chem. Soc. 80: 3161; Oliveto et al., (1958) J.
Am. Chem. Soc. 4431; Fried & Sabo, (1954) J. Am. Chem. Soc. 76: 1455;
Hirschman et al., (1956) J. Am. Chem. Soc. 78: 4957 and U.S. Patent No.
3,007,923 to Muller et al., all of which are incorporated herein in their entirety.



V. The TIF2 Fragment 

The nuclear receptor co-activator TIF2 (SEQ ID NO:17) was co-crystallized
in one aspect of the present invention. Structurally, the nuclear receptor
co-activator TIF2 comprises one domain that interacts with a nuclear receptor
(nuclear receptor interaction domain, abbreviated "NID") and two autonomous
activation domains, AD1 and AD2 (Voegel et al., (1998) EMBO J. 17: 507-519).
The TIF2 NID comprises three NR-interacting modules, with each module
comprising the motif LXXLL (SEQ ID NO:18) (Voegel et al., (1998) EMBO J. 17:
507-519). Mutation of the motif abrogates TIF2's ability to interact with the
ligand-induced activation function-2 (AF-2) found in the ligand-binding domains
(LBDs) of many NRs. Presently, it is thought that TIF2 AD1 activity is mediated by
CREB binding protein (CBP); however, TIF2 AD2 activity does not appear to involve
interaction with CBP (Voegel et al., (1998) EMBO J. 17: 507-519).

In the present invention, residues 732-756 of the TIF2 protein (SEQ ID
NO:17) were co-crystallized with GR and dexamethasone. These residues
comprise the LXXLL motif (SEQ ID NO:18) of AD-2, the third motif in the linear
sequence of TIF2. The TIF2 fragment is 25 residues in length and was
synthesized using an automated peptide synthesis apparatus. SEQ ID NO:17,
and other sequences corresponding to TIF2 and other co-activators and
co-repressors, can be similarly synthesized using automated apparatuses.
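
The LXXLL motif described above (leucine, any two residues, then two leucines) lends itself to a simple pattern search. The sketch below is illustrative only: the helper `find_lxxll` and the example peptide are hypothetical and do not reproduce SEQ ID NO:17 or SEQ ID NO:18. Positions are reported as zero-based offsets, and a lookahead is used so that overlapping motifs are also found.

```python
import re

# L, any two residues, then LL; the lookahead allows overlapping matches.
LXXLL_PATTERN = re.compile(r"(?=(L..LL))")


def find_lxxll(sequence: str):
    """Return (zero-based start, matched motif) for every LXXLL occurrence."""
    return [(m.start(), m.group(1)) for m in LXXLL_PATTERN.finditer(sequence.upper())]


# Hypothetical peptide used only to exercise the search.
example_peptide = "AAKLQALLEEGGLRYLLDK"
for start, motif in find_lxxll(example_peptide):
    print(start, motif)
```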

VI. Production of NR, SR and GR Polypeptides

In a preferred embodiment, the present invention provides, for the first time,
for the expression of a soluble GR polypeptide in bacteria, more preferably, in E.
coli. The GR polypeptides of the present invention, disclosed herein, can thus
now provide a variety of host-expression vector systems to express an NR, SR or
GR coding sequence. These include but are not limited to microorganisms such 
as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or 
cosmid DNA expression vectors containing an NR, SR or GR coding sequence; 
yeast transformed with recombinant yeast expression vectors containing an NR, 
SR or GR coding sequence; insect cell systems infected with recombinant virus 
expression vectors (e.g., baculovirus) containing an NR, SR or GR coding 
sequence; plant cell systems infected with recombinant virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed
with recombinant plasmid expression vectors (e.g., Ti plasmid) containing an NR,
SR or GR coding sequence; or animal cell systems. The expression elements of
these systems vary in their strength and specificities. Methods for constructing
expression vectors that comprise a partial or the entire native or mutated NR and
GR polypeptide coding sequence and appropriate transcriptional/translational
control signals include in vitro recombinant DNA techniques, synthetic techniques
and in vivo recombination/genetic recombination. See, for example, the
techniques described throughout Sambrook et al., (1989) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, New York, and Ausubel et al.,
(1989) Current Protocols in Molecular Biology, Greene Publishing Associates and
Wiley Interscience, New York, both incorporated herein in their entirety.

Depending on the host/vector system utilized, any of a number of suitable
transcription and translation elements, including constitutive and inducible
promoters, can be used in the expression vector. For example, when cloning in
bacterial systems, inducible promoters such as pL of bacteriophage λ, plac, ptrp,
ptac (ptrp-lac hybrid promoter) and the like can be used. When cloning in insect
cell systems, promoters such as the baculovirus polyhedrin promoter can be used.
When cloning in plant cell systems, promoters derived from the genome of plant
cells (e.g., heat shock promoters; the promoter for the small subunit of
RUBISCO; the promoter for the chlorophyll a/b binding protein) or from plant
viruses (e.g., the 35S RNA promoter of CaMV; the coat protein promoter of TMV)
can be used. When cloning in mammalian cell systems, promoters derived from
the genome of mammalian cells (e.g., metallothionein promoter) or from
mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K
promoter) can be used. When generating cell lines that contain multiple copies of
the tyrosine kinase domain DNA, SV40-, BPV- and EBV-based vectors can be
used with an appropriate selectable marker.

Adequate levels of expression of nuclear receptor LBDs can be obtained by
the novel approaches described herein. High level expression in E. coli of ligand
binding domains of TR and other nuclear receptors, including members of the
steroid/thyroid receptor superfamily, such as the estrogen (ER), androgen (AR),
mineralocorticoid (MR), progesterone (PR), RAR, RXR and vitamin D (VDR)
receptors can also be achieved after review of the expression of a soluble GR
polypeptide in bacteria, more preferably, E. coli disclosed herein. The GR
polypeptides of the present invention, disclosed herein, can thus now provide a
variety of host-expression vector systems. Yeast and other eukaryotic expression
systems can be used with nuclear receptors that bind heat shock proteins since
these nuclear receptors are generally more difficult to express in bacteria, with the
exception of ER, which can be expressed in bacteria. In a preferred embodiment
of the present invention, as disclosed in the Examples, a GR LBD is expressed in
E. coli.

Representative nuclear receptors or their ligand binding domains have
been cloned and sequenced, including human RARα, human RARγ, human
RXRα, human RXRβ, human PPARα, human PPARβ or δ (delta), human PPARγ,
human VDR, human ER (as described in Seielstad et al., (1995) Mol. Endocrinol.
9: 647-658), human GR, human PR, human MR, and human AR. The ligand
binding domain of each of these nuclear receptors has been identified. Using this
information in conjunction with the methods described herein, one of ordinary skill
in the art can express and purify the LBD of any of these nuclear receptors, bind it
to an appropriate ligand, and crystallize the nuclear receptor's LBD with a bound
ligand, if desired.

Extracts of expressing cells are a suitable source of receptor for purification 
and preparation of crystals of the chosen receptor. To obtain such expression, a 
vector can be constructed in a manner similar to that employed for expression of 
the rat TR alpha (Apriletti et al., (1995) Protein Expression and Purification, 6: 
368-370). The nucleotides encoding the amino acids encompassing the ligand 
binding domain of the receptor to be expressed can be inserted into an expression 
vector such as the one employed by Apriletti et al. (1995). Stretches of adjacent
amino acid sequences can be included if more structural information is desired. 

The native and mutated nuclear receptors in general, and more particularly 
SR and GR polypeptides, and fragments thereof, of the present invention can also 
be chemically synthesized in whole or part using techniques that are known in the 
art (See, e.g., Creighton, (1983) Proteins: Structures and Molecular Principles,
W.H. Freeman & Co., New York, incorporated herein in its entirety).

In a preferred embodiment, the present invention provides, for the first time,
for the expression of a soluble GR polypeptide in bacteria, more preferably, E.
coli, and subsequent purification thereof. Representative purification techniques 
are also disclosed in the Examples, particularly Example 1. The GR polypeptides 
of the present invention, disclosed herein, can thus now provide the ability to 
employ additional purification techniques for both liganded and unliganded NRs. 
Thus, it is envisioned, based upon the disclosure of the present invention, that 
purification of the unliganded or liganded NR, SR or GR receptor can be obtained 
by conventional techniques, such as hydrophobic interaction chromatography 
(HPLC), ion exchange chromatography (HPLC), and heparin affinity 
chromatography. To achieve higher purification for improved crystals of nuclear 
receptors it is sometimes preferable to ligand shift purify the nuclear receptor 
using a column that separates the receptor according to charge, such as an ion 
exchange or hydrophobic interaction column, and then bind the eluted receptor
with a ligand. The ligand induces a change in the receptor's surface charge such
that when re-chromatographed on the same column, the receptor then elutes at
the position of the liganded receptor and is removed by the original column run
with the unliganded receptor. Typically, saturating concentrations of ligand can be
used in the column and the protein can be preincubated with the ligand prior to
passing it over the column.

More recently developed methods involve engineering a "tag" such as with
histidine placed on the end of the protein, such as on the amino terminus, and
then using a nickel chelation column for purification. See Janknecht, (1991) Proc.
Natl. Acad. Sci. U.S.A. 88: 8972-8976, incorporated by reference.

VII. Formation of NR, SR and GR Ligand Binding Domain Crystals 

In one embodiment, the present invention provides crystals of GRα LBD.
The crystals were obtained using the methodology disclosed in the Laboratory
Examples. The GRα LBD crystals, which can be native crystals, derivative
crystals or co-crystals, have hexagonal unit cells (a hexagonal unit cell is a unit
cell wherein a = b ≠ c, and wherein α = β = 90°, and γ = 120°) and space group
symmetry P6₁. There are two GRα LBD molecules in the asymmetric unit. In this
GRα crystalline form, the unit cell has dimensions of a = b = 126.014 Å, c = 86.312
Å, and α = β = 90°, and γ = 120°. This crystal form can be formed in a
crystallization reservoir as described in the Examples.
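
For orientation, the volume of a unit cell with the dimensions stated above can be computed from the general triclinic formula, which reduces to a²c·sin(120°) for a hexagonal cell. The sketch below is an illustration under that standard formula; the function name `unit_cell_volume` is an assumption and not part of the disclosure.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """General unit cell volume (cubic angstroms) from edge lengths and angles."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# Cell dimensions quoted in the text for the GR LBD crystal form.
volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)
print(f"{volume:.0f} cubic angstroms")
```

With the quoted dimensions the volume comes to roughly 1.19 × 10⁶ Å³.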

VII.A. Preparation of NR, SR and GR Crystals

The native and derivative co-crystals, and fragments thereof, disclosed in
the present invention can be obtained by a variety of techniques, including batch,
liquid bridge, dialysis, vapor diffusion and hanging drop methods (See, e.g.,
McPherson, (1982) Preparation and Analysis of Protein Crystals, John Wiley, New
York; McPherson, (1990) Eur. J. Biochem. 189:1-23; Weber, (1991) Adv. Protein
Chem. 41:1-36). In a preferred embodiment, the vapor diffusion and hanging drop
methods are used for the crystallization of NR, SR and GR polypeptides and
fragments thereof. A more preferred hanging drop method technique is disclosed
in the Examples.

In general, native crystals of the present invention are grown by dissolving
substantially pure NR, SR or GR polypeptide or a fragment thereof in an aqueous
buffer containing a precipitant at a concentration just below that necessary to
precipitate the protein. Water is removed by controlled evaporation to produce
precipitating conditions, which are maintained until crystal growth ceases.

In one embodiment of the invention, native crystals are grown by vapor
diffusion (See, e.g., McPherson, (1982) Preparation and Analysis of Protein
Crystals, John Wiley, New York; McPherson, (1990) Eur. J. Biochem. 189:1-23).
In this method, the polypeptide/precipitant solution is allowed to equilibrate in a
closed container with a larger aqueous reservoir having a precipitant
concentration optimal for producing crystals. Generally, less than about 25 µL of
NR, SR or GR polypeptide solution is mixed with an equal volume of reservoir
solution, giving a precipitant concentration about half that required for
crystallization. This solution is suspended as a droplet underneath a coverslip
which is sealed onto the top of the reservoir. The sealed container is allowed to
stand until crystals grow. Crystals generally form within two to six weeks, and are
suitable for data collection within approximately seven to ten weeks. Of course,
those of skill in the art will recognize that the above-described crystallization
procedures and conditions can be varied.
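
The statement that mixing equal volumes gives about half the reservoir precipitant concentration follows from simple dilution. A minimal sketch is given below, assuming that the polypeptide solution itself contains no precipitant before mixing; the function name and the example numbers are hypothetical.

```python
def initial_drop_concentration(reservoir_conc, protein_volume, reservoir_volume):
    """Precipitant concentration in the drop immediately after mixing.

    Assumes the polypeptide solution carries no precipitant of its own.
    Volumes may be in any unit, provided both use the same one.
    """
    total_volume = protein_volume + reservoir_volume
    if total_volume <= 0:
        raise ValueError("total drop volume must be positive")
    return reservoir_conc * reservoir_volume / total_volume


# Hypothetical example: 2 volume units of protein plus 2 of a 20% reservoir.
print(initial_drop_concentration(20.0, 2.0, 2.0))  # prints 10.0
```

As water leaves the drop by vapor diffusion, the precipitant concentration in the drop rises from this starting value toward that of the reservoir.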



VII.B. Preparation of Derivative Crystals

Derivative crystals of the present invention, e.g. heavy atom derivative 
crystals, can be obtained by soaking native crystals in mother liquor containing 
salts of heavy metal atoms. Such derivative crystals are useful for phase analysis 
in the solution of crystals of the present invention. In a preferred embodiment of 
the present invention, for example, soaking a native crystal in a solution 
containing methyl-mercury chloride provides derivative crystals suitable for use as 
isomorphous replacements in determining the X-ray crystal structure of a NR, SR 
or GR polypeptide. Additional reagents useful for the preparation of the derivative 
crystals of the present invention will be apparent to those of skill in the art after 
review of the disclosure of the present invention presented herein. 

VII.C. Preparation of Co-crystals

Co-crystals of the present invention can be obtained by soaking a native 
crystal in mother liquor containing compounds known or predicted to bind the LBD 
of a NR, SR or GR, or a fragment thereof. Alternatively, co-crystals can be 
obtained by co-crystallizing a NR, SR or GR LBD polypeptide or a fragment
thereof in the presence of one or more compounds known or predicted to bind the 
polypeptide. In a preferred embodiment, as disclosed in the Examples, such a 
compound is dexamethasone. 

VII.D. Solving a Crystal Structure of the Present Invention

Crystal structures of the present invention can be solved using a variety of
techniques including, but not limited to, isomorphous replacement, anomalous
scattering or molecular replacement methods. Computer software packages are
also helpful in solving a crystal structure of the present invention. Applicable
software packages include but are not limited to the CCP4 package disclosed in
the Examples, the X-PLOR™ program (Brünger, (1992) X-PLOR Version 3.1: A
System for X-ray Crystallography and NMR, Yale University Press, New Haven,
Connecticut; X-PLOR is available from Molecular Simulations, Inc., San Diego,
California), XtalView (McRee, (1992) J. Mol. Graphics 10: 44-46; XtalView is
available from the San Diego Supercomputer Center), SHELXS 97 (Sheldrick,
(1990) Acta Cryst. A46: 467; SHELX 97 is available from the Institute of Inorganic
Chemistry, Georg-August-Universität, Göttingen, Germany), HEAVY (Terwilliger,
Los Alamos National Laboratory) and SHAKE-AND-BAKE (Hauptman, (1997)
Curr. Opin. Struct. Biol. 7: 672-80; Weeks et al., (1993) Acta Cryst. D49: 179;
available from the Hauptman-Woodward Medical Research Institute, Buffalo, New
York). See also, Ducruix & Giegé, (1992) Crystallization of Nucleic
Acids and Proteins: A Practical Approach, IRL Press, Oxford, England, and
references cited therein.

VIII. Characterization and Solution of a GRα Ligand Binding Domain Crystal

Referring now to Figure 3A, the overall arrangement of the GR LBD dimer 
is depicted in a ribbon/worm diagram that was derived from the crystalline 
polypeptide of the present invention. The two GR LBDs are shown in white and 
gray worm representation. The TIF2 peptides TIF2 are shown in gray ribbon and
two dexamethasone ligands DEX are shown in space filling. The N terminus and
C terminus of each GR LBD are labeled with an N and a C, respectively. There is an
interface between the two LBDs at beta turns and beta strands.

Referring now to Figures 3B and 3C, two orientations of the GR/TIF2/DEX 
complex are depicted. In each figure, the TIF2 peptide TIF2 is shown in ribbon 
and the GR LBD is shown in worm. The AF2 helix AF2 of GR is shown in gray 
worm in each figure. The key structural elements helix 9 H9 and helix 3 H3 are 
indicated, as is the N terminus N. The DEX compound DEX is shown in dark gray 
shading. In Figures 3B and 3C, the interaction of helix 3 H3 and the AF2 helix
AF2 with dexamethasone DEX can be seen. 

Referring now to Figures 4A and 4B, the overlap of GR LBD with the LBDs 
of the AR and PR (Figures 4A and 4B, respectively) is depicted. The AR and PR 
are shown as a thin line, while the GR is shown as a thick line. Backbone Cα
atoms are also shown. This superposition is consistent with the sequence 
alignment approach taken in the design of the GR LBD polypeptide disclosed 
herein. 

RMS deviation calculation results were as follows:

        GR      PR      AR
GR     0.00    0.94    1.56
PR     0.94    0.00    1.34
AR     1.56    1.34    0.00

where in each of the three calculations, the RMS deviation was computed
using 980 backbone atoms (N, Cα, C and O) from 245 aligned residues. These
245 residues are GR:531-775, PR:686-897,899-931 and AR:672-883,885-917.
Several GR and PR residues before helix-1 were omitted in the calculations, as
was one residue at the C-terminus, to correspond to the shorter AR construct.
One residue (PR:898 and AR:884) was also omitted in the 10-AF2 loop because
of the deletion in GR. The RMS deviations suggest that the AR structure has
diverged away from GR and PR, and graphical examination confirmed this at least
qualitatively.
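
The residue and atom counts above can be checked with a few lines of arithmetic, and the RMS deviation itself is a simple function of paired coordinates. The sketch below is illustrative only. It assumes that the two coordinate sets have already been superimposed (the superposition step is not shown), and it takes the PR range as 686-897, the reading under which each receptor contributes 245 residues and residue 898 is the omitted one. All function and variable names are hypothetical.

```python
import math

# Aligned residue ranges (inclusive) for each receptor.
ALIGNED_RANGES = {
    "GR": [(531, 775)],
    "PR": [(686, 897), (899, 931)],
    "AR": [(672, 883), (885, 917)],
}
BACKBONE_ATOMS = ("N", "CA", "C", "O")  # four backbone atoms per residue


def count_residues(ranges):
    """Number of residues covered by a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


def rmsd(coords_a, coords_b):
    """RMS deviation between two equally long lists of superimposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


for receptor, ranges in ALIGNED_RANGES.items():
    n_res = count_residues(ranges)
    print(receptor, n_res, n_res * len(BACKBONE_ATOMS))
```

Each receptor yields 245 residues and 980 backbone atoms, matching the figures stated in the text.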

Referring now to Figure 5, a sequence alignment of steroid receptors,
particularly an alignment of the F602S GRα sequence with MR, PR, AR, ERα, and
ERβ is depicted. Residues that lie within 5.0 angstroms of the ligand are identified
with small square boxes around the one-letter amino acid code. The ligands used
for this calculation are dexamethasone (for GR), progesterone (for PR),
dihydrotestosterone (for AR), estradiol (for ERα) and genistein (for ERβ). The
alpha-helices and beta-strands observed in the X-ray structures are identified by
the larger boxes and captions. Note that the secondary structure of MR is not
publicly known at this time, and is thus not annotated in the Figure. More than
one structure is available for PR, AR, ERα and ERβ, and, in some cases, the
alpha-helices have different endpoints in these different X-ray structures. The
variation in the alpha-helices is indicated here by using boxes with thicker and
thinner linewidths, where the thicker linewidth box encompasses residues that
adopt the same secondary structure in all available X-ray structures, and thinner
linewidth boxes encompass residues that adopt an alpha-helical structure in some
but not all X-ray structures. The secondary structures were determined by
graphical examination of the X-ray structures.
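
The 5.0 angstrom criterion used to flag ligand-contacting residues can be expressed as a distance cutoff between protein atoms and ligand atoms. The sketch below is one plausible reading (a residue is flagged if any of its atoms lies within the cutoff of any ligand atom); the text does not specify which atoms were used, and the residue labels and coordinates here are hypothetical.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted residue labels having any atom within `cutoff` of any ligand atom.

    protein_atoms: iterable of (residue_label, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    """
    ligand_atoms = list(ligand_atoms)
    near = set()
    for residue, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(residue)
    return sorted(near)


# Hypothetical coordinates in angstroms, used only to exercise the function.
toy_protein = [
    ("A1", (0.0, 0.0, 0.0)),
    ("A1", (9.0, 0.0, 0.0)),
    ("B2", (4.0, 3.0, 0.0)),
    ("C3", (10.0, 10.0, 10.0)),
]
toy_ligand = [(1.0, 0.0, 0.0)]
print(residues_near_ligand(toy_protein, toy_ligand))
```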

It is also noted that, within the ligand binding domains (LBDs), the 
sequence identity is as follows: 

Table 1

Sequence Identity of NR LBDs

        GR      MR      PR      AR
GR     100%    56%     54%     50%
MR     56%     100%    55%     51%
PR     54%     55%     100%    55%
AR     50%     51%     55%     100%
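
Percent identity values such as those in Table 1 are computed from a pairwise alignment. The sketch below shows one common convention (identical positions divided by aligned positions in which neither sequence has a gap); the text does not state which denominator was used for Table 1, so this convention, the function name and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences, not taken from any receptor.
print(percent_identity("ACDE-F", "ACDQGF"))  # prints 80.0
```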

VIII.A. Unique Structural Differences Between GRα and Other SRs

Even though the GR LBD shares over 50% sequence identity with PR and
AR and folds into a similar three-layer helical sandwich (Figures 4A and 4B), there
are a number of unique structural differences in their structures. The most distinct
differences are noted in the extended strand between helices 1 and 3, and the
position of helix 7. These differences contribute to a unique shape of the binding
pocket for each receptor (Figures 6A and 6B) and may thus provide a molecular
basis for steroid specificity of these receptors. The detailed structural information
about the GR LBD and the pocket provided herein can be further exploited to
design receptor specific agonists or antagonists.

VIII.B. Dexamethasone

The ligand binding domain of GRα was co-crystallized with
dexamethasone, which has the IUPAC name
(11β,16α)-9-fluoro-11β,17,21-trihydroxy-16α-methylpregna-1,4-diene-3,20-dione
and is shown below.

[Chemical structure of dexamethasone]

Dexamethasone is an agonist of GRα and is useful for treatment of
GRα-mediated diseases or conditions including inflammation, tissue rejection,
auto-immunity, malignancies such as leukemias and lymphomas, Cushing's
syndrome, acute adrenal insufficiency, congenital adrenal hyperplasia, rheumatic
fever, polyarteritis nodosa, granulomatous polyarteritis, inhibition of myeloid cell
lines, immune proliferation/apoptosis, HPA axis suppression and regulation,
hypercortisolemia, modulation of the Th1/Th2 cytokine balance, chronic kidney
disease, stroke and spinal cord injury, hypercalcemia, hyperglycemia, acute
adrenal insufficiency, chronic primary adrenal insufficiency, secondary adrenal
insufficiency, congenital adrenal hyperplasia, cerebral edema, thrombocytopenia,
and Little's syndrome as well as many other conditions.



VIII.C. Characterization of the GRα Binding Pocket and Interactions
Between GRα and Dexamethasone

Referring now to Figure 6A, the GR ligand binding pocket is depicted
schematically. The GR ligand binding pocket is shown in a worm representation
and the pocket is shown with a white surface. The gross shape of the binding
pocket is depicted here with a smooth surface that covers the available volume
within the binding pocket. The available volume is mapped by placing the protein
within a grid, and then checking, for each grid point, whether a spherical probe
atom can fit at that point without bumping into the protein. The spacing of grid
points was taken as 0.50 Å, and the radius of the probe atom was taken as 1.40 Å.
Atoms in the protein were represented as spheres with a radius of 1.20 Å for
hydrogen, 1.70 Å for carbon, 1.55 Å for nitrogen, 1.52 Å for oxygen and 1.80 Å for
sulfur. These are essentially the atomic radius values suggested by Bondi (A.
Bondi, "van der Waals Volumes and Radii," Journal of Physical Chemistry, 68,
441-451 (1964)). The protein was represented with all hydrogen atoms in order to
handle its volume more accurately. These hydrogen atoms were added to obtain
the protonation states expected at pH 7 using the MVP program. The MVP
program adds hydrogens using standard geometry, and then refines the initial
coordinates with energy minimization, holding all heavy atoms fixed. The
"available" grid points are defined as those for which the probe sphere does not
bump into any sphere corresponding to a protein atom. The smooth surface was
then constructed over these available binding site grid points using the dot surface
program of Connolly (Michael L. Connolly, "Solvent-Accessible Surfaces of
Proteins and Nucleic Acids," Science 221, 709-713 (1983)) with a probe radius of
1.30 Å. The protein chain is shown with a backbone ribbon depiction.
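
The grid-based mapping of available volume described above can be sketched as follows. This is a simplified illustration, not the program used to prepare Figure 6A: it applies only the clash test (probe sphere against atom spheres) using the stated radii, grid spacing and probe radius, and it does not attempt to restrict the result to points enclosed within the pocket, add hydrogens, or build the Connolly surface. The function name, the box limits and the single-atom example are hypothetical.

```python
import math

# Atomic radii (angstroms) as stated in the text, after Bondi.
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
PROBE_RADIUS = 1.40   # angstroms
GRID_SPACING = 0.50   # angstroms


def available_grid_points(atoms, lower, upper,
                          spacing=GRID_SPACING, probe_radius=PROBE_RADIUS):
    """Return grid points where the probe sphere does not bump into any atom sphere.

    atoms: list of (element, (x, y, z)); lower/upper: opposite corners of the grid box.
    """
    steps = [int(round((hi - lo) / spacing)) + 1 for lo, hi in zip(lower, upper)]
    available = []
    for i in range(steps[0]):
        for j in range(steps[1]):
            for k in range(steps[2]):
                point = (lower[0] + i * spacing,
                         lower[1] + j * spacing,
                         lower[2] + k * spacing)
                # The probe fits if its center is at least (atom radius + probe
                # radius) away from every atom center.
                if all(math.dist(point, xyz) >= BONDI_RADII[element] + probe_radius
                       for element, xyz in atoms):
                    available.append(point)
    return available


# Hypothetical example: a single carbon atom at the origin in an 8 angstrom box.
toy_atoms = [("C", (0.0, 0.0, 0.0))]
points = available_grid_points(toy_atoms, (-4.0, -4.0, -4.0), (4.0, 4.0, 4.0))
print(len(points))
```

In practice the nested loops would be replaced by a spatial lookup for a full protein, since the brute-force test above scales with the number of grid points times the number of atoms.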

Referring now to Figure 6B, electron density in the GR-dexamethasone
interface is depicted. The electron density is calculated with Fo coefficients and
shown at a one sigma cutoff. The ligand DEX is in the center of the figure. Key
residues L732, A605, R611, Q570, G567, N564, and F749 encircle ligand DEX.
Ligand DEX displays a good spatial fit, with no overlaps and no apparent charge
repulsions.

Referring now to Figure 7, molecular interactions between the GR protein 
and the dexamethasone are depicted. There are 22 residues from GR involved in 
direct interactions with the dexamethasone, and the residues are Q570, L566,
G567, L563, W600, L753, N564, F749, C736, I747, M560, T739, Q642, Y735,
L732, M646, M601, A605, F623, M604, L608, and R611.

VIII.D. Structural Mechanism of Improving Protein Solubility by the F602S
Mutation

Figure 8 is a wireframe diagram that provides a closer look at the F602S
mutation. The F602 is lipophilic but resides in a hydrophilic environment, a
situation that could destabilize the protein. The mutation of the phenylalanine (F)
to the serine (S) allows the S602 side chain to make direct hydrogen bonds with
two water molecules, shown as 1H2O and 2H2O in Figure 8. Association
distances of 2.416 and 4.036 are indicated between S602 and 1H2O and 2H2O,
respectively. Other residues are also shown in interaction with 1H2O and 2H2O,
and these include H726 (which is also coordinated with water molecule 1H2O),
Y764 (which is also coordinated with water molecules 1H2O and 2H2O), Y598 and
W600. An association distance of 4.354 is shown between 1H2O and H726; and
an association distance of 3.286 is shown with Y764. An association distance of
3.157 is shown between 1H2O and 2H2O. It is envisioned that this complex
hydrogen bond network initiated by the F602S mutation and the two water
molecules improves the protein stability and thus the solubility as well.

VIII.E. Generation of Easily-Solved NR, SR and GR Crystals

The present invention discloses a substantially pure GR LBD polypeptide in
crystalline form. In a preferred embodiment, exemplified in the Figures and
Laboratory Examples, GRα is crystallized with bound ligand. Crystals can be
formed from NR, SR and GR LBD polypeptides that are usually expressed by a 
cell culture, such as E. coli. Bromo- and iodo-substitutions can be included during 
the preparation of crystal forms and can act as heavy atom substitutions in GR 
ligands and crystals of NRs, SRs and GRs. This method can be advantageous for 
the phasing of the crystal, which is a crucial, and sometimes limiting, step in 
solving the three-dimensional structure of a crystallized entity. Thus, the need for 
generating the heavy metal derivatives traditionally employed in crystallography 
can be eliminated. After the three-dimensional structure of a NR, SR or GR, or an 
NR, SR or GR LBD with or without a ligand bound is determined, the resultant
three-dimensional structure can be used in computational methods to design
synthetic ligands for NR, SR or GR and for other NR, SR or GR polypeptides.
Further activity structure relationships can be determined through routine testing,
using assays disclosed herein and known in the art.

IX. Uses of NR, SR and GR Crystals and the Three-Dimensional Structure of
the Ligand Binding Domain of GRα

The solved crystal structure of the present invention is useful in the design
of modulators of activity mediated by the glucocorticoid receptor and by other
nuclear receptors. Evaluation of the available sequence data shows that GRα is
particularly similar to MR, PR and AR. The GRα LBD has approximately 55%,
54% and 50% sequence identity to the MR, PR and AR LBDs, respectively. The
GRβ amino acid sequence is identical to the GRα amino acid sequence for
residues 1-726, but the remaining 16 residues in GRβ show no significant
similarity to the remaining 51 residues in GRα.

The present GRα X-ray structure can also be used to build models for
targets where no X-ray structure is available, such as with GRβ and MR. Indeed,
a model for GRα using the available X-ray structures of PR and/or AR as
templates was built and used by the present co-inventors to obtain a starting
model for the molecular replacement calculation used in solving the X-ray
structure of GRα disclosed herein. These models will be less accurate than X-ray
structures, but can help in the design of compounds targeted for GRβ and MR, for
example. Also, these models can aid the design of compounds to selectively
modulate any desired subset of GRα, GRβ, MR, PR, AR and other related nuclear
receptors.

IX.A. Design and Development of NR, SR and GR Modulators

The present invention, particularly the computational methods, can be used
to design drugs for a variety of nuclear receptors, such as receptors for
glucocorticoids (GRs), androgens (ARs), mineralocorticoids (MRs), progestins
(PRs), estrogens (ERs), thyroid hormones (TRs), vitamin D (VDRs), retinoids
(RARs and RXRs) and peroxisomal proliferators (PPARs). The present invention
can also be applied to the "orphan receptors," as they are structurally homologous
in terms of modular domains and primary structure to classic nuclear receptors,
such as steroid and thyroid receptors. The amino acid homologies of orphan
receptors with other nuclear receptors range from very low (<15%) to about
35% when compared to rat RARα and human TRβ receptors, for
example.

The knowledge of the structure of the GRa ligand binding domain (LBD), 
an aspect of the present invention, provides a tool for investigating the mechanism 
of action of GRa and other NR, SR and GR polypeptides in a subject. For 
example, various computer modeling programs, as described herein, can predict 
the binding of various ligand molecules to the LBD of GRp, or another steroid 
receptor or, more generally, nuclear receptor. Upon discovering that such binding 
in fact takes place, knowledge of the protein structure then allows design and 
synthesis of small molecules that mimic the functional binding of the ligand to the 
LBD of GRa, and to the LBDs of other polypeptides. This is the method of 
"rational" drug design, further described herein. 

Use of the isolated and purified GRa crystalline structure of the present 
invention in rational drug design is thus provided in accordance with the present 
invention. Additional rational drug design techniques are described in U.S. Patent 
Nos. 5,834,228 and 5,872,011, incorporated herein in their entirety. 

Thus, in addition to the compounds described herein, other sterically similar 
compounds can be formulated to interact with the key structural regions of an NR, 
SR or GR in general, or of GRa in particular. The generation of a structural 
functional equivalent can be achieved by the techniques of modeling and 
chemical design known to those of skill in the art and described herein. It will be 
understood that all such sterically similar constructs fall within the scope of the 
present invention. 






IX.A.1. Rational Drug Design 

The three-dimensional structure of ligand-binding GRa is unprecedented 
and will greatly aid in the development of new synthetic ligands for NR, SR and 
GR polypeptides, such as GR agonists and antagonists, including those that bind 
exclusively to any one of the GR subtypes. In addition, NRs, SRs and GRs are 
well suited to modern methods, including three-dimensional structure elucidation 
and combinatorial chemistry, such as those disclosed in U.S. Patent Nos. 
5,463,564 and 6,236,946, incorporated herein by reference. Structure 
determination using X-ray crystallography is possible because of the solubility 
properties of NRs, SRs and GRs. Computer programs that use crystallography 
data when practicing the present invention will enable the rational design of 
ligands to these receptors. 

Programs such as RASMOL (Biomolecular Structures Group, Glaxo 
Wellcome Research & Development, Stevenage, Hertfordshire, UK, Version 2.6, 
August 1995, Version 2.6.4, December 1998, Copyright © Roger Sayle 1992-1999) 
and Protein Explorer (Version 1.87, July 3, 2001, © Eric Martz, 2001 and 
available online at http://www.umass.edu/microbio/chime/explorer/index.htm) can 
be used with the atomic structural coordinates from crystals generated by 
practicing the invention or used to practice the invention by generating 
three-dimensional models and/or determining the structures involved in ligand binding. 
Computer programs such as those sold under the registered trademark INSIGHT 
II® and the programs GRASP (Nicholls et al., (1991) Proteins 11: 281) and 
SYBYL™ (available from Tripos, Inc. of St. Louis, Missouri) allow for further 
manipulations and the ability to introduce new structures. In addition, high 
throughput binding and bioactivity assays can be devised using purified 
recombinant protein and modern reporter gene transcription assays known to 
those of skill in the art in order to refine the activity of a designed ligand. 

A method of identifying modulators of the activity of an NR, SR or GR 
polypeptide using rational drug design is thus provided in accordance with the 
present invention. The method comprises designing a potential modulator for an 
NR, SR or GR polypeptide of the present invention that will form non-covalent 
interactions with amino acids in the ligand binding pocket based upon the 
crystalline structure of the GRa LBD polypeptide; synthesizing the modulator; and 
determining whether the potential modulator modulates the activity of the NR, SR 
or GR polypeptide. In a preferred embodiment, the modulator is designed for an 
SR polypeptide. In a more preferred embodiment, the modulator is designed for a 
GRa polypeptide. Preferably, the GRa polypeptide comprises the amino acid 
sequence of any of SEQ ID NOs:2, 4, 6 and 8, and more preferably, the GRa LBD 
comprises the amino acid sequence of any of SEQ ID NOs:10, 12, 14, 16 and 31. 
The determination of whether the modulator modulates the biological activity of an 
NR, SR or GR polypeptide is made in accordance with the screening methods 
disclosed herein, or by other screening methods known to those of skill in the art. 
Modulators can be synthesized using techniques known to those of ordinary skill 
in the art. 

In an alternative embodiment, a method of designing a modulator of an NR, 
SR or GR polypeptide in accordance with the present invention is disclosed 
comprising: (a) selecting a candidate NR, SR or GR ligand; (b) determining which 
amino acid or amino acids of an NR, SR or GR polypeptide interact with the ligand 
using a three-dimensional model of a crystallized GRa LBD; (c) identifying in a 
biological assay for NR, SR or GR activity a degree to which the ligand modulates 
the activity of the NR, SR or GR polypeptide; (d) selecting a chemical modification 
of the ligand wherein the interaction between the amino acids of the NR, SR or 
GR polypeptide and the ligand is predicted to be modulated by the chemical 
modification; (e) synthesizing a chemical compound with the selected chemical 
modification to form a modified ligand; (f) contacting the modified ligand with the 
NR, SR or GR polypeptide; (g) identifying in a biological assay for NR, SR or GR 
activity a degree to which the modified ligand modulates the biological activity of 
the NR, SR or GR polypeptide; and (h) comparing the biological activity of the NR, 
SR or GR polypeptide in the presence of modified ligand with the biological 
activity of the NR, SR or GR polypeptide in the presence of the unmodified ligand, 
whereby a modulator of an NR, SR or GR polypeptide is designed. 

An additional method of designing modulators of an NR, SR or GR or an 
NR, SR or GR LBD can comprise: (a) determining which amino acid or amino 
acids of an NR, SR or GR LBD interact with a first chemical moiety (at least one) 
of the ligand using a three dimensional model of a crystallized protein comprising 
an NR, SR or GR LBD in complex with a bound ligand and a co-activator; and (b) 
selecting one or more chemical modifications of the first chemical moiety to 
produce a second chemical moiety with a structure to either decrease or increase 
an interaction between the interacting amino acid and the second chemical moiety 
compared to the interaction between the interacting amino acid and the first 
chemical moiety. This is a general strategy only, however, and variations on this 
disclosed protocol would be apparent to those of skill in the art upon consideration 
of the present disclosure. 

Once a candidate modulator is synthesized as described herein and as will 
be known to those of skill in the art upon contemplation of the present invention, it 
can be tested using assays to establish its activity as an agonist, partial agonist or 
antagonist, and affinity, as described herein. After such testing, a candidate 
modulator can be further refined by generating LBD crystals with the candidate 
modulator bound to the LBD. The structure of the candidate modulator can then 
be further refined using the chemical modification methods described herein for 
three dimensional models to improve the activity or affinity of the candidate 
modulator and make second generation modulators with improved properties, 
such as that of a super agonist or antagonist, as described herein. 

IX.A.2. Methods for Using the GRa LBD Structural Coordinates For Molecular Design 

For the first time, the present invention permits the use of molecular design 
techniques to design, select and synthesize chemical entities and compounds, 
including modulatory compounds, capable of binding to the ligand binding pocket 
or an accessory binding site of an NR, SR or GR and an NR, SR or GR LBD, in 
whole or in part. Correspondingly, the present invention also provides for the 
application of similar techniques in the design of modulators of any NR, SR or GR 
polypeptide. 

In accordance with a preferred embodiment of the present invention, the 
structure coordinates of a crystalline GRa LBD can be used to design compounds 
that bind to a GR LBD (more preferably a GRa LBD) and alter the properties of a 
GR LBD (for example, the dimerization ability, ligand binding ability or effect on 
transcription) in different ways. One aspect of the present invention provides for 
the design of compounds that can compete with natural or engineered ligands of a 
GR polypeptide by binding to all, or a portion of, the binding sites on a GR LBD. 
The present invention also provides for the design of compounds that can bind to 
all, or a portion of, an accessory binding site on a GR that is already binding a 
ligand. Similarly, non-competitive agonists/ligands that bind to and modulate GR 
LBD activity, whether or not it is bound to another chemical entity, and partial 
agonists and antagonists can be designed using the GR LBD structure 
coordinates of this invention. 

A second design approach is to probe an NR, SR or GR or an NR, SR or 
GR LBD (preferably a GRa or GRa LBD) crystal with molecules comprising a 
variety of different chemical entities to determine optimal sites for interaction 
between candidate NR, SR or GR or NR, SR or GR LBD modulators and the 
polypeptide. For example, high resolution X-ray diffraction data collected from 
crystals saturated with solvent allows the determination of the site where each 
type of solvent molecule adheres. Small molecules that bind tightly to those sites 
can then be designed, synthesized and tested for their NR, SR or GR 
modulator activity. Representative designs are also disclosed in published PCT 
application WO 99/26966. 

Once a computationally-designed ligand is synthesized using the methods 
of the present invention or other methods known to those of skill in the art, assays 
can be used to establish the efficacy of the ligand as a modulator of NR, SR or GR 
(preferably GRa) activity. After such assays, the ligands can be further refined by 
generating intact NR, SR or GR, or NR, SR or GR LBD, crystals with a ligand 
bound to the LBD. The structure of the ligand can then be further refined using 
the chemical modification methods described herein and known to those of skill in 
the art, in order to improve the modulation activity or the binding affinity of the 
ligand. This process can lead to second generation ligands with improved 
properties. 

Ligands also can be selected that modulate NR, SR or GR responsive 
gene transcription by the method of altering the interaction of co-activators and 
co-repressors with their cognate NR, SR or GR. For example, agonistic ligands 
can be selected that block or dissociate a co-repressor from interacting with a GR, 
and/or that promote binding or association of a co-activator. Antagonistic ligands 
can be selected that block co-activator interaction and/or promote co-repressor 
interaction with a target receptor. Selection can be done via binding assays that 
screen for designed ligands having the desired modulatory properties. Preferably, 
interactions of a GRa polypeptide are targeted. A suitable screening assay 
that can be employed, mutatis mutandis, in the present invention is described in 
Oberfield, J.L., et al., Proc Natl Acad Sci USA. (1999) May 25; 96(11):6102-6, 
incorporated herein in its entirety by reference. Other examples of suitable 
screening assays for GR function include an in vitro peptide binding assay 
representing ligand-induced interaction with coactivator (Zhou, et al., (1998) Mol. 
Endocrinol. 12: 1594-1604; Parks et al., (1999) Science 284: 1365-1368) or a 
cell-based reporter assay related to transcription from a GRE (reviewed in Jenkins 
et al., (2001) Trends Endocrinol. Metab. 12: 122-126) or a cell-based reporter assay 
related to repression of genes driven via NF-κB (DeBosscher et al., (2000) Proc 
Natl Acad Sci USA. 97: 3919-3924). 

IX.A.3. Methods of Designing NR, SR or GR LBD Modulator Compounds 

Knowledge of the three-dimensional structure of the GR LBD complex of 
the present invention can facilitate a general model for modulator (e.g. agonist, 
partial agonist, antagonist and partial antagonist) design. Other ligand-receptor 
complexes belonging to the nuclear receptor superfamily can have a ligand 
binding pocket similar to that of GR and therefore the present invention can be 
employed in agonist/antagonist design for other members of the nuclear receptor 
superfamily and the steroid receptor subfamily. Examples of suitable receptors 
include those of the NR superfamily and those of the SR subfamily. 

The design of candidate substances, also referred to as "compounds" or 
"candidate compounds", that bind to or inhibit NR, SR or GR LBD-mediated 
activity according to the present invention generally involves consideration of two 
factors. First, the compound must be capable of physically and structurally 
associating with a NR, SR or GR LBD. Non-covalent molecular interactions 
important in the association of a NR, SR or GR LBD with its substrate include 
hydrogen bonding, van der Waals interactions and hydrophobic interactions. 

The interaction between an atom of a LBD amino acid and an atom of an 
LBD ligand can be made by any force or attraction described in nature. Usually 
the interaction between the atom of the amino acid and the ligand will be the result 
of a hydrogen bonding interaction, charge interaction, hydrophobic interaction, 
van der Waals interaction or dipole interaction. In the case of the hydrophobic 
interaction it is recognized that this is not a per se interaction between the amino 
acid and ligand, but rather the usual result, in part, of the repulsion of water or 
other hydrophilic group from a hydrophobic surface. Reducing or enhancing the 
interaction of the LBD and a ligand can be measured by calculating or testing 
binding energies, computationally or using thermodynamic or kinetic methods as 
known in the art. 
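
By way of illustration only, the thermodynamic route can be sketched with the 
standard relation between a measured dissociation constant and the binding free 
energy, dG = RT ln(Kd). The function name, the temperature and the example Kd 
values below are assumptions chosen for this sketch and are not taken from the 
present disclosure. 

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation
    constant in mol/L, using dG = RT ln(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Hypothetical values: a 1 nM parent ligand and a modified ligand at 100 nM.
dg_parent = binding_free_energy(1e-9)
dg_modified = binding_free_energy(1e-7)

# A positive difference means the modification reduced the interaction;
# a negative difference means it enhanced the interaction.
ddg = dg_modified - dg_parent
print(f"parent {dg_parent:.2f}, modified {dg_modified:.2f}, change {ddg:+.2f} kcal/mol")
```

A tighter-binding ligand has a smaller Kd and therefore a more negative binding 
free energy, so the sign of the difference indicates whether a modification 
reduced or enhanced the interaction. 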

Second, the compound must be able to assume a conformation that allows 
it to associate with a NR, SR or GR LBD. Although certain portions of the 
compound will not directly participate in this association with a NR, SR or GR 
LBD, those portions can still influence the overall conformation of the molecule. 
This, in turn, can have a significant impact on potency. Such conformational 
requirements include the overall three-dimensional structure and orientation of the 
chemical entity or compound in relation to all or a portion of the binding site, e.g., 
the ligand binding pocket or an accessory binding site of a NR, SR or GR LBD, or 
the spacing between functional groups of a compound comprising several 
chemical entities that directly interact with a NR, SR or GR LBD. 

Chemical modifications will often enhance or reduce interactions of an 
atom of a LBD amino acid and an atom of an LBD ligand. Steric hindrance can 
be a common means of changing the interaction of a LBD binding pocket with an 
activation domain. Chemical modifications are preferably introduced at C-H, C- 
and C-OH positions in a ligand, where the carbon is part of the ligand structure 
that remains the same after modification is complete. In the case of C-H, C could 
have 1, 2 or 3 hydrogens, but usually only one hydrogen will be replaced. The H 
or OH can be removed after modification is complete and replaced with a desired 
chemical moiety. 

The potential modulatory or binding effect of a chemical compound on a 
NR, SR or GR LBD can be analyzed prior to its actual synthesis and testing by the 
use of computer modeling techniques that employ the coordinates of a crystalline 
GRa LBD polypeptide of the present invention. If the theoretical structure of the 
given compound suggests insufficient interaction and association between it and a 
NR, SR or GR LBD, synthesis and testing of the compound is obviated. However, 
if computer modeling indicates a strong interaction, the molecule can then be 
synthesized and tested for its ability to bind and modulate the activity of a NR, SR 
or GR LBD. In this manner, synthesis of unproductive or inoperative compounds 
can be avoided. 

A modulatory or other binding compound of a NR, SR or GR LBD 
polypeptide (preferably a GRa LBD) can be computationally evaluated and 
designed via a series of steps in which chemical entities or fragments are 
screened and selected for their ability to associate with an individual binding site 
or other area of a crystalline GRa LBD polypeptide of the present invention and to 
interact with the amino acids disposed in the binding sites. 

The atoms of the interacting amino acids that form contacts with a ligand are 
usually 2 to 4 angstroms away from the centers of the atoms of the ligand. 
Generally these distances are determined by computer as 
discussed herein and in McRee (McRee, (1993) Practical Protein Crystallography, 
Academic Press, New York); however, distances can be determined manually 
once the three dimensional model is made. More commonly, the atoms of the 
ligand and the atoms of interacting amino acids are 3 to 4 angstroms apart. A 
ligand can also interact with distant amino acids, after chemical modification of the 
ligand to create a new ligand. Distant amino acids are generally not in contact with 
the ligand before chemical modification. A chemical modification can change the 
structure of the ligand to make a new ligand that interacts with a distant amino 
acid usually at least 4.5 angstroms away from the ligand. Often distant amino 
acids will not line the surface of the binding cavity for the ligand, as they are too 
far away from the ligand to be part of a pocket or surface of the binding cavity. 
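
The distance criteria above can be sketched as a simple computation. The 
following is a minimal illustration only: the residue labels and coordinates are 
invented placeholders rather than values from Table 4, and the function names 
and the "intermediate" label for the 4 to 4.5 angstrom gap are assumptions made 
for the sketch. 

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest distance (angstroms) between any atom of A and any atom of B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def classify_residues(residues, ligand_atoms, contact_cutoff=4.0, distant_cutoff=4.5):
    """Label each residue by its closest approach to the ligand.

    'contact'      : within contact_cutoff (the 2 to 4 angstrom range)
    'distant'      : at least distant_cutoff away
    'intermediate' : between the two cutoffs (label assumed for this sketch)
    """
    labels = {}
    for name, atoms in residues.items():
        d = min_distance(atoms, ligand_atoms)
        if d <= contact_cutoff:
            labels[name] = "contact"
        elif d >= distant_cutoff:
            labels[name] = "distant"
        else:
            labels[name] = "intermediate"
    return labels


# Hypothetical toy coordinates (x, y, z) in angstroms.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
residues = {
    "RES_A": [(0.0, 3.5, 0.0)],  # 3.5 angstroms from the first ligand atom
    "RES_B": [(1.5, 0.0, 6.0)],  # 6.0 angstroms from the second ligand atom
}
labels = classify_residues(residues, ligand)
print(labels)
```

Residues labeled "distant" in such a listing are candidates for the chemical 
modifications described above that extend the ligand toward a new interaction. 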

A variety of methods can be used to screen chemical entities or fragments 
for their ability to associate with an NR, SR or GR LBD and, more particularly, with 
the individual binding sites of an NR, SR or GR LBD, such as the ligand binding 
pocket or an accessory binding site. This process can begin by visual inspection 
of, for example, the ligand binding pocket on a computer screen based on the 
GRa LBD atomic coordinates in Table 4, as described herein. Selected 
fragments or chemical entities can then be positioned in a variety of orientations, 
or docked, within an individual binding site of a GRa LBD as defined herein 
above. Docking can be accomplished using software programs such as those 
available under the tradenames QUANTA™ (Molecular Simulations Inc., San 
Diego, California) and SYBYL™ (Tripos, Inc., St. Louis, Missouri), followed by 
energy minimization and molecular dynamics with standard molecular mechanics 
forcefields, such as CHARMM (Brooks et al., (1983) J. Comp. Chem., 8: 132) and 
AMBER 5 (Case et al., (1997), AMBER 5, University of California, San Francisco; 
Pearlman et al., (1995) Comput. Phys. Commun. 91: 1-41). 

Specialized computer programs can also assist in the process of selecting 
fragments or chemical entities. These include: 

1. GRID™ program, version 17 (Goodford, (1985) J. Med. Chem. 28: 849-57), 
which is available from Molecular Discovery Ltd., Oxford, UK; 

2. MCSS™ program (Miranker & Karplus, (1991) Proteins 11: 29-34), 
which is available from Molecular Simulations, Inc., San Diego, California; 

3. AUTODOCK™ 3.0 program (Goodsell & Olsen, (1990) Proteins 8: 195-202), 
which is available from the Scripps Research Institute, La Jolla, California; 

4. DOCK™ 4.0 program (Kuntz et al., (1992) J. Mol. Biol. 161: 269-88), 
which is available from the University of California, San Francisco, California; 

5. FLEX-X™ program (See, Rarey et al., (1996) J. Comput. Aid. Mol. Des. 
10:41-54), which is available from Tripos, Inc., St. Louis, Missouri; 

6. MVP program (Lambert, (1997) in Practical Application of Computer-Aided 
Drug Design, (Charifson, ed.) Marcel-Dekker, New York, pp. 243-303); and 

7. LUDI™ program (Bohm, (1992) J. Comput. Aid. Mol. Des., 6: 61-78), 
which is available from Molecular Simulations, Inc., San Diego, California. 

Once suitable chemical entities or fragments have been selected, they can 
be assembled into a single compound or modulator. Assembly can proceed by 
visual inspection of the relationship of the fragments to each other on the three- 
dimensional image displayed on a computer screen in relation to the structure 
coordinates of a GRa LBD. Manual model building using software such as 
QUANTA™ or SYBYL™ typically follows. 

Useful programs to aid one of ordinary skill in the art in connecting the 
individual chemical entities or fragments include: 

1. CAVEAT™ program (Bartlett et al., (1989) Special Pub., Royal Chem. 
Soc. 78: 182-96), which is available from the University of California, Berkeley, 
California; 

2. 3D Database systems, such as MACCS-3D™ system program, which is 
available from MDL Information Systems, San Leandro, California. This area is 
reviewed in Martin, (1992) J. Med. Chem. 35: 2145-54; and 

3. HOOK™ program (Eisen et al., (1994) Proteins 19: 199-221), which is 
available from Molecular Simulations, Inc., San Diego, California. 

Instead of proceeding to build a GR LBD modulator (preferably a GRa LBD 
modulator) in a step-wise fashion one fragment or chemical entity at a time as 
described above, modulatory or other binding compounds can be designed as a 
whole or de novo using the structural coordinates of a crystalline GRa LBD 
polypeptide of the present invention and either an empty binding site or optionally 
including some portion(s) of a known modulator(s). Applicable methods can 
employ the following software programs: 

1. LUDI™ program (Bohm, (1992) J. Comput. Aid. Mol. Des., 6: 61-78), 
which is available from Molecular Simulations, Inc., San Diego, California; 

2. LEGEND™ program (Nishibata & Itai, (1991) Tetrahedron 47: 8985); and 

3. LEAPFROG™, which is available from Tripos Associates, St. Louis, 
Missouri. 

Other molecular modeling techniques can also be employed in accordance 
with this invention. See, e.g., Cohen et al., (1990) J. Med. Chem. 33: 883-94. 
See also, Navia & Murcko, (1992) Curr. Opin. Struc. Biol. 2: 202-10; U.S. Patent 
No. 6,008,033, herein incorporated by reference. 

Once a compound has been designed or selected by the above methods, 
the efficiency with which that compound can bind to a NR, SR or GR LBD can be 
tested and optimized by computational evaluation. By way of particular example, 
a compound that has been designed or selected to function as a NR, SR or GR 
LBD modulator should also preferably traverse a volume not overlapping that 
occupied by the binding site when it is bound to its native ligand. Additionally, an 
effective NR, SR or GR LBD modulator should preferably demonstrate a relatively 
small difference in energy between its bound and free states (i.e., a small 
deformation energy of binding). Thus, the most efficient NR, SR and GR LBD 
modulators should preferably be designed with a deformation energy of binding of 
not greater than about 10 kcal/mole, and preferably, not greater than 7 kcal/mole. 
It is possible for NR, SR and GR LBD modulators to interact with the polypeptide 
in more than one conformation that is similar in overall binding energy. In those 
cases, the deformation energy of binding is taken to be the difference between the 
energy of the free compound and the average energy of the conformations 
observed when the modulator binds to the polypeptide. 
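
The deformation energy criterion can be sketched as follows. This is a minimal 
illustration under stated assumptions: the energies are invented numbers, the 
conformer energies are assumed to come from one of the programs listed below, 
and the function names are hypothetical. 

```python
def deformation_energy(free_energy, bound_energies):
    """Deformation energy of binding in kcal/mol.

    free_energy    : internal energy of the free (unbound) compound
    bound_energies : internal energies of the compound in each bound
                     conformation; their average is used when the compound
                     binds in more than one conformation
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    mean_bound = sum(bound_energies) / len(bound_energies)
    return mean_bound - free_energy


def within_limit(e_def, limit=10.0):
    """True if the deformation energy does not exceed the limit (kcal/mol)."""
    return e_def <= limit


# Hypothetical compound with two bound conformations of similar energy.
e_def = deformation_energy(free_energy=-42.0, bound_energies=[-36.0, -38.0])
print(e_def, within_limit(e_def), within_limit(e_def, limit=7.0))
```

In this invented example the compound meets both the 10 kcal/mole limit and the 
preferred 7 kcal/mole limit. 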

A compound designed or selected as binding to an NR, SR or GR 
polypeptide (preferably a GRa LBD polypeptide) can be further computationally 
optimized so that in its bound state it would preferably lack repulsive electrostatic 
interaction with the target polypeptide. Such non-complementary (e.g., 
electrostatic) interactions include repulsive charge-charge, dipole-dipole and 
charge-dipole interactions. Specifically, the sum of all electrostatic interactions 
between the modulator and the polypeptide when the modulator is bound to an 
NR, SR or GR LBD preferably makes a neutral or favorable contribution to the 
enthalpy of binding. 
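
As a rough illustration of the electrostatic criterion, the pairwise sum can be 
sketched with Coulomb's law between point charges. This is a simplified sketch 
only: the partial charges and coordinates are invented, a constant dielectric is 
assumed, and the programs listed below use considerably more detailed 
treatments. 

```python
import math

# Conversion factor giving energies in kcal/mol for charges in units of the
# elementary charge and distances in angstroms.
COULOMB_KCAL = 332.06


def electrostatic_energy(ligand_atoms, protein_atoms, dielectric=1.0):
    """Sum of pairwise Coulomb energies between ligand and protein atoms.

    Each atom is a (partial_charge, (x, y, z)) tuple. A negative total is
    favorable; a positive total indicates net repulsion.
    """
    total = 0.0
    for q_lig, pos_lig in ligand_atoms:
        for q_prot, pos_prot in protein_atoms:
            r = math.dist(pos_lig, pos_prot)
            total += COULOMB_KCAL * q_lig * q_prot / (dielectric * r)
    return total


# Hypothetical partial charges and positions.
ligand_atoms = [(-0.5, (0.0, 0.0, 0.0))]
protein_atoms = [(+0.5, (3.0, 0.0, 0.0)), (-0.2, (0.0, 4.0, 0.0))]

energy = electrostatic_energy(ligand_atoms, protein_atoms)
is_neutral_or_favorable = energy <= 0.0
print(f"{energy:.2f} kcal/mol, acceptable: {is_neutral_or_favorable}")
```

Here the attractive pair outweighs the repulsive pair, so the sum is favorable 
even though one individual interaction is repulsive. 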

Specific computer software is available in the art to evaluate compound 
deformation energy and electrostatic interaction. Examples of programs designed 
for such uses include: 

1. Gaussian 98™, which is available from Gaussian, Inc., Pittsburgh, 
Pennsylvania; 

2. AMBER™ program, version 6.0, which is available from the University 
of California at San Francisco; 

3. QUANTA™ program, which is available from Molecular Simulations, 
Inc., San Diego, California; 

4. CHARMm® program, which is available from Molecular Simulations, 
Inc., San Diego, California; and 

5. Insight II® program, which is available from Molecular Simulations, Inc., 
San Diego, California. 

These programs can be implemented using a suitable computer system. 
Other hardware systems and software packages will be apparent to those skilled 
in the art after review of the disclosure of the present invention presented herein. 

Once an NR, SR or GR LBD modulating compound has been optimally 
selected or designed, as described above, substitutions can then be made in 
some of its atoms or side groups in order to improve or modify its binding 
properties. Generally, initial substitutions are conservative, i.e., the replacement 
group will have approximately the same size, shape, hydrophobicity and charge 
as the original group. It should, of course, be understood that components known 
in the art to alter conformation are preferably avoided. Such substituted chemical 
compounds can then be analyzed for efficiency of fit to an NR, SR or GR LBD 
binding site using the same computer-based approaches described in detail 
above. 

IX.B. Distinguishing Between GR Subtypes and Between NRs 
The present invention also is applicable to generating new synthetic ligands 
to distinguish nuclear receptor subtypes. As described herein, modulators can be 
generated that distinguish between subtypes, thereby allowing the generation of 
either tissue specific or function specific synthetic ligands. For instance, the GRa 
gene can be translated from its mRNA by alternative initiation from an internal 
ATG codon (Yudt & Cidlowski (2001) Molec. Endocrinol. 15: 1093-1103). This 
codon codes for methionine at position 27 and translation from this position 
produces a slightly smaller protein. These two isoforms, translated from the same 
gene, are referred to as GR-A and GR-B. It has been shown in a cellular system 
that the shorter GR-B form is more effective in initiating transcription from a GRE 
compared to GR-A. Additionally, another form of GR, called GRp, is produced by 
an alternative splicing event. The GRp protein differs from GRa at the very 
C-terminus, where the final 50 amino acids are replaced with a 15 amino acid 
segment. These two isoforms are 100% identical up to amino acid 727. No 
sequence similarity exists between GRa and GRp at the C-terminus beyond 
position 727. GRp has been shown to be a dominant negative regulator of 
GRa-mediated gene transcription (Oakley, Sar & Cidlowski (1996) J. Biol. Chem. 
271: 9550-9559). It has been suggested that some of the tissue specific effects 
observed with glucocorticoid treatment may in part be due to the presence of 
varying amounts of each isoform in certain cell-types. This method is also applicable to 
any other subfamily so organized. 

The present invention discloses the ability to generate new synthetic 
ligands to distinguish between GR subtypes. As described herein, 
computer-designed ligands (i.e. candidate modulators and modulators) can be generated 
that distinguish between GR subtypes, thereby allowing the generation of either 
tissue specific or function specific ligands. The atomic structural coordinates 
disclosed in the present invention reveal structural details unique to GRa. These 
structural details can be exploited when a novel ligand is designed using the 
methods of the present invention or other ligand design methods known in the art. 
The structural features that differentiate, for example, a GRa from a GRp can be 
targeted in ligand design. Thus, for example, a ligand can be designed that will 
recognize GRa, while not interacting with other GRs or even with moieties having 
similar structural features. Prior to the disclosure of the present invention, the 
ability to target a GR subtype was unattainable. 

The present invention also pertains to a method for designing an agonist or 
modulator with desired levels of activity on at least two subtypes, GRa and GRp. 
In a preferred embodiment, the method comprises obtaining atomic coordinates 
for structures of the GRa and/or GRp ligand binding domains. The structures can 
comprise GRa and GRp, each bound to various different ligands, and also can 
comprise structures where no ligand is present. The structures can also comprise 
models where a compound has been docked into a particular GR using a 
molecular docking procedure, such as the MVP program disclosed herein. 
Optionally, the structures are rotated and translated so as to superimpose 
corresponding Ca or backbone atoms; this facilitates the comparison of 
structures. 

The GRa and GRp structures can also be compared using a computer 
graphics system to identify regions of the ligand binding site that have similar 
shape and electrostatic character, and to identify regions of the ligand binding site 
that are narrowed or constricted in one or both of the GRs, particularly as 
compared to other NRs. Since these GRs are subject to conformational 
changes, attention is paid to the range of motion observed for each protein atom 
over the whole collection of structures. The ligand structures, including both those 
determined by X-ray crystallography and those modeled using molecular docking 
procedures, can be examined using a computer graphics system to identify 
ligands where a chemical modification could increase or decrease binding to a 
particular GR, or decrease activity against a particular GR. Additionally or 
alternatively, the chemical modification can introduce a group into a volume that is 
normally occupied by an atom of that GR. 
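
The range of motion mentioned above can be sketched as the largest displacement 
of each atom across a set of structures. This is an illustration only: it 
assumes the structures have already been superimposed as described above, the 
atom labels and coordinates are invented, and the function name is hypothetical. 

```python
import math


def range_of_motion(structures):
    """Largest pairwise displacement (angstroms) of each atom across structures.

    structures : list of dicts mapping an atom label to (x, y, z), all in the
                 same superimposed frame. Only atoms present in every
                 structure are reported.
    """
    common = set(structures[0]).intersection(*structures[1:])
    motion = {}
    for atom in sorted(common):
        positions = [s[atom] for s in structures]
        motion[atom] = max(
            (math.dist(p, q) for i, p in enumerate(positions) for q in positions[i + 1:]),
            default=0.0,
        )
    return motion


# Three hypothetical superimposed structures with two labeled atoms each.
structures = [
    {"ATOM_1": (0.0, 0.0, 0.0), "ATOM_2": (5.0, 0.0, 0.0)},
    {"ATOM_1": (0.0, 0.0, 0.2), "ATOM_2": (5.0, 3.0, 0.0)},
    {"ATOM_1": (0.0, 0.0, 0.1), "ATOM_2": (5.0, 0.0, 4.0)},
]
motion = range_of_motion(structures)
print(motion)
```

Atoms with a small range, such as the first atom here, define regions where a 
rigid fit can be relied upon, whereas atoms with a large range indicate regions 
that may move to accommodate a modified ligand. 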

Optionally, to selectively decrease activity against a particular GR, the 
chemical modification can be made so as to occupy volume that is normally 
occupied by atoms of that particular GR, but not by atoms of the other GRs. To 
increase activity against a particular GR, a chemical modification can be made 
that improves interactions with that particular GR. To selectively increase activity 
against a particular GR, a chemical modification can be made that improves the 
interactions with that particular GR, but does not improve the interactions with the 
other GRs. Other design principles can also be used to increase or decrease 
activity on a particular GR. 
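The selectivity principle above can be sketched as a simple geometric test: a group added to a ligand is predicted to reduce activity selectively if it would collide with atoms of one receptor but not with atoms of the others. The sketch below is only a minimal illustration under stated assumptions: the receptors are taken to be superimposed in a common frame, the 3.0 angstrom clash cutoff is an arbitrary choice, and all names and coordinates are hypothetical.

```python
import math


def clashes(point, receptor_atoms, cutoff=3.0):
    """True if `point` lies closer than `cutoff` to any receptor atom."""
    return any(math.dist(point, atom) < cutoff for atom in receptor_atoms)


def selective_against(point, target_atoms, other_receptors, cutoff=3.0):
    """True if a group placed at `point` would collide with the target
    receptor while still fitting into every other receptor."""
    return clashes(point, target_atoms, cutoff) and not any(
        clashes(point, atoms, cutoff) for atoms in other_receptors
    )


# Hypothetical binding-site atoms after superposition (angstroms).
gr_alpha = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
gr_beta = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
new_group = (5.0, 0.0, 0.0)

print(selective_against(new_group, gr_alpha, [gr_beta]))
```

A real analysis would use atomic radii and allow for receptor flexibility; the single distance cutoff used here is a deliberate simplification.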

Thus, various possible compounds and chemical modifications can be
considered and compared graphically, and with molecular modeling tools, for
synthetic feasibility and likelihood of achieving the desired profile of activation of
GRα and GRβ. Compounds that appear synthetically feasible and that have a
good likelihood of achieving the desired profile are synthesized. The compounds
can then be tested for binding and/or activation of GRα and GRβ, and tested for
their overall biological effect.

A method of identifying a NR modulator that selectively modulates the
biological activity of one NR compared to GRα is also disclosed. In one
embodiment, the method comprises: (a) providing an atomic structure coordinate
set describing a GRα ligand binding domain structure and at least one other
atomic structure coordinate set describing a NR ligand binding domain, each
ligand binding domain comprising a ligand binding site; (b) comparing the atomic
structure coordinate sets to identify at least one difference between the sets; (c)
designing a candidate ligand predicted to interact with the difference of step (b);
(d) synthesizing the candidate ligand; and (e) testing the synthesized candidate
ligand for an ability to selectively modulate a NR as compared to GRα, whereby a
NR modulator that selectively modulates the biological activity of the NR compared
to GRα is identified.

Preferably, the GRα atomic structure coordinate set is the atomic structure
coordinate set shown in Table 4. Optionally, the NR is selected from the group
consisting of MR, PR, AR, GRβ and isoforms thereof that have ligands that also
bind GRα.

Method of Screening for Chemical and Biological Modulators of the Biological
Activity of an NR, SR or GR
A candidate substance identified according to a screening assay of the
present invention has an ability to modulate the biological activity of an NR, SR or
GR or an NR, SR or GR LBD polypeptide. In a preferred embodiment, such a
candidate compound can have utility in the treatment of disorders and/or
conditions and/or biological events associated with the biological activity of an NR,
SR or GR or an NR, SR or GR LBD polypeptide, including transcription
modulation.

In a cell-free system, the method comprises the steps of establishing a
control system comprising a GRα polypeptide and a ligand which is capable of
binding to the polypeptide; establishing a test system comprising a GRα
polypeptide, the ligand, and a candidate compound; and determining whether the
candidate compound modulates the activity of the polypeptide by comparison of
the test and control systems. A representative ligand can comprise
dexamethasone or other small molecule, and in this embodiment, the biological
activity or property screened can include binding affinity or transcription regulation.
The GRα polypeptide can be in soluble or crystalline form.

In another embodiment of the invention, a soluble or a crystalline form of a
GRα polypeptide or a catalytic or immunogenic fragment or oligopeptide thereof,
can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such a screening can be affixed
to a solid support. The formation of binding complexes, between a soluble or a
crystalline GRα polypeptide and the agent being tested, will be detected. In a
preferred embodiment, the soluble or crystalline GRα polypeptide has an amino
acid sequence of any of SEQ ID NOs:4, 6, 8 or 10. When a GRα LBD
polypeptide is employed, a preferred embodiment will include a soluble or a
crystalline GRα polypeptide having the amino acid sequence of any of SEQ ID
NOs:12, 14, 16 or 31.

Another technique for drug screening which can be used provides for high
throughput screening of compounds having suitable binding affinity to the protein
of interest as described in published PCT application WO 84/03564, herein
incorporated by reference. In this method, as applied to a soluble or crystalline
polypeptide of the present invention, large numbers of different small test
compounds are synthesized on a solid substrate, such as plastic pins or some
other surface. The test compounds are reacted with the soluble or crystalline
polypeptide, or fragments thereof. Bound polypeptide is then detected by
methods known to those of skill in the art. The soluble or crystalline polypeptide
can also be placed directly onto plates for use in the aforementioned drug
screening techniques.

In yet another embodiment, a method of screening for a modulator of an
NR, SR or GR or an NR, SR or GR LBD polypeptide comprises: providing a library
of test samples; contacting a soluble or a crystalline form of an NR, SR or GR or a
soluble or crystalline form of an NR, SR or GR LBD polypeptide with each test
sample; detecting an interaction between a test sample and a soluble or a
crystalline form of an NR, SR or GR or a soluble or a crystalline form of an NR,
SR or GR LBD polypeptide; identifying a test sample that interacts with a soluble
or a crystalline form of an NR, SR or GR or a soluble or a crystalline form of an
NR, SR or GR LBD polypeptide; and isolating a test sample that interacts with a
soluble or a crystalline form of an NR, SR or GR or a soluble or a crystalline form
of an NR, SR or GR LBD polypeptide.

In each of the foregoing embodiments, an interaction can be detected
spectrophotometrically, radiologically, colorimetrically or immunologically. An
interaction between a soluble or a crystalline form of an NR, SR or GR or a
soluble or a crystalline form of an NR, SR or GR LBD polypeptide and a test
sample can also be quantified using methodology known to those of skill in the art.

In accordance with the present invention there is also provided a rapid and
high throughput screening method that relies on the methods described above.
This screening method comprises separately contacting each of a plurality of
substantially identical samples with a soluble or a crystalline form of an NR, SR or
GR or a soluble or a crystalline form of an NR, SR or GR LBD and detecting a
resulting binding complex. In such a screening method the plurality of samples
preferably comprises more than about 10^4 samples, or more preferably comprises
more than about 5 x 10^4 samples.

In another embodiment, a method for identifying a substance that
modulates GR LBD function is also provided. In a preferred embodiment, the
method comprises: (a) isolating a GR polypeptide of the present invention; (b)
exposing the isolated GR polypeptide to a plurality of substances; (c) assaying
binding of a substance to the isolated GR polypeptide; and (d) selecting a
substance that demonstrates specific binding to the isolated GR LBD polypeptide.
By the term "exposing the GR polypeptide to a plurality of substances", it is meant
both in pools and as multiple samples of "discrete" pure substances.

IX.D. Method of Identifying Compounds Which Inhibit Ligand Binding
In one aspect of the present invention, an assay method for identifying a 
compound that inhibits binding of a ligand to an NR, SR or GR polypeptide is 
disclosed. A ligand, such as dexamethasone (which associates with at least GR),
can be used in the assay method as the ligand against which the inhibition by a
test compound is gauged. In the following discussion of Section IX.D it will be
understood that although GR is used as an example, the method is equally
applicable to any of NR, SR or GR polypeptide. The method comprises (a)
incubating a GR polypeptide with a ligand in the presence of a test inhibitor
compound; (b) determining an amount of ligand that is bound to the GR 
polypeptide, wherein decreased binding of ligand to the GR polypeptide in the 
presence of the test inhibitor compound relative to binding in the absence of the 
test inhibitor compound is indicative of inhibition; and (c) identifying the test 
compound as an inhibitor of ligand binding if decreased ligand binding is 
observed. Preferably, the ligand is dexamethasone. 
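The comparison in steps (b) and (c) can be sketched numerically as follows. This is a minimal illustration only: the signal values, compound names and the 50% threshold are hypothetical, and the text above requires only that ligand binding be decreased relative to the no-inhibitor control, not any particular cutoff.

```python
def percent_inhibition(bound_without, bound_with):
    """Percent decrease in bound ligand caused by the test compound."""
    if bound_without <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (bound_without - bound_with) / bound_without


def identify_inhibitors(control_signal, test_signals, threshold=50.0):
    """Return the names of compounds whose percent inhibition meets the
    (assumed) threshold, in the order given."""
    return [
        name
        for name, signal in test_signals.items()
        if percent_inhibition(control_signal, signal) >= threshold
    ]


# Hypothetical bound-ligand signals (e.g. counts) for three test compounds.
control = 1000.0
tests = {"compound_A": 200.0, "compound_B": 950.0, "compound_C": 500.0}

for name, signal in tests.items():
    print(name, percent_inhibition(control, signal))
print(identify_inhibitors(control, tests))
```

In practice the threshold would be set from the variability of replicate controls rather than fixed in advance.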

In another aspect of the present invention, the disclosed assay method can 
be used in the structural refinement of candidate GR inhibitors. For example, 
multiple rounds of optimization can be followed by gradual structural changes in a 
strategy of inhibitor design. A strategy such as this is made possible by the 
disclosure of the atomic coordinates of the GRα LBD.

X. Design, Preparation and Structural Analysis of Additional NR, SR and GR
Polypeptides and NR, SR and GR LBD Mutants and Structural Equivalents
The present invention provides for the generation of NR, SR and GR
polypeptides and NR, SR or GR mutants (preferably GRα and GRα LBD
mutants), and the ability to solve the crystal structures of those that crystallize.
Indeed, a GRα LBD having a point mutation was crystallized and solved in one
aspect of the present invention. Thus, an aspect of the present invention involves
the use of both targeted and random mutagenesis of the GR gene for the
production of a recombinant protein with improved or desired characteristics for
the purpose of crystallization, characterization of biologically relevant
protein-protein interactions, and compound screening assays, or for the production of a
recombinant protein having other desirable characteristic(s). Polypeptide
products produced by the methods of the present invention are also disclosed
herein.

The structure coordinates of a NR, SR or GR LBD provided in accordance
with the present invention also facilitate the identification of related proteins or
enzymes analogous to GRα in function, structure or both (for example, a GRβ),
which can lead to novel therapeutic modes for treating or preventing a range of
disease states. More particularly, through the provision of the mutagenesis
approaches as well as the three-dimensional structure of a GRα LBD disclosed
herein, desirable sites for mutation are identified.

X.A. Sterically Similar Compounds 

A further aspect of the present invention is that sterically similar
compounds can be formulated to mimic the key portions of an NR, SR or GR LBD
structure. Such compounds are functional equivalents. The generation of a
structural functional equivalent can be achieved by the techniques of modeling
and chemical design known to those of skill in the art and described herein.
Modeling and chemical design of NR, SR or GR and NR, SR or GR LBD structural
equivalents can be based on the structure coordinates of a crystalline GRα LBD
polypeptide of the present invention. It will be understood that all such sterically
similar constructs fall within the scope of the present invention.

X.B. NR, SR and GR Polypeptides

The generation of chimeric GR polypeptides is also an aspect of the
present invention. Such a chimeric polypeptide can comprise an NR, SR or GR
LBD polypeptide or a portion of an NR, SR or GR LBD, (e.g. a GRα LBD) that is
fused to a candidate polypeptide or a suitable region of the candidate polypeptide,
for example GRβ. Throughout the present disclosure it is intended that the term
"mutant" encompass not only mutants of an NR, SR or GR LBD polypeptide but
chimeric proteins generated using an NR, SR or GR LBD as well. It is thus
intended that the following discussion of mutant NR, SR and GR LBDs apply
mutatis mutandis to chimeric NR, SR and GR polypeptides and NR, SR and GR
LBD polypeptides and to structural equivalents thereof.

In accordance with the present invention, a mutation can be directed to a
particular site or combination of sites of a wild-type NR, SR or GR LBD. For
example, an accessory binding site or the binding pocket can be chosen for
mutagenesis. Similarly, a residue having a location on, at or near the surface of
the polypeptide can be replaced, resulting in an altered surface charge of one or
more charge units, as compared to the wild-type NR, SR or GR and NR, SR or
GR LBDs. Alternatively, an amino acid residue in an NR, SR or GR or an NR, SR
or GR LBD can be chosen for replacement based on its hydrophilic or
hydrophobic characteristics.

Such mutants can be characterized by any one of several different
properties, i.e. a "desired" or "predetermined" characteristic as compared with the
wild-type NR, SR or GR LBD. For example, such mutants can have an altered
surface charge of one or more charge units, or can have an increase in overall
stability. Other mutants can have altered substrate specificity in comparison with,
or a higher specific activity than, a wild-type NR, SR or GR or an NR, SR or GR
LBD.
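To make the notion of an altered surface charge "of one or more charge units" concrete, the following sketch tallies the nominal charge change produced by a substitution written in the usual one-letter form (for example V552K). The charge assignments are a simplifying assumption (lysine and arginine +1, aspartate and glutamate -1, histidine treated as neutral), and the function name is hypothetical.

```python
import re

# Nominal side-chain charges near neutral pH (simplifying assumption:
# histidine is treated as uncharged).
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_change(mutation):
    """Net charge change for a substitution such as 'V552K', or for several
    substitutions joined by '/', such as 'L636E/C638S'."""
    total = 0
    for single in mutation.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", single)
        if match is None:
            raise ValueError(f"unrecognized mutation: {single!r}")
        wild_type, _position, replacement = match.groups()
        total += SIDE_CHAIN_CHARGE.get(replacement, 0)
        total -= SIDE_CHAIN_CHARGE.get(wild_type, 0)
    return total


for name in ["V552K", "F602D", "F602S", "L636E/C638S"]:
    print(name, charge_change(name))
```

The tally ignores whether the residue is actually solvent exposed; that would need to be checked against the three-dimensional structure.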

NR, SR or GR and NR, SR or GR LBD mutants of the present invention
can be generated in a number of ways. For example, the wild-type sequence of
an NR, SR or GR or an NR, SR or GR LBD can be mutated at those sites
identified using this invention as desirable for mutation, by means of
oligonucleotide-directed mutagenesis or other conventional methods, such as
deletion. Alternatively, mutants of an NR, SR or GR or an NR, SR or GR LBD can
be generated by the site-specific replacement of a particular amino acid with an
unnaturally occurring amino acid. In addition, NR, SR or GR or NR, SR or GR
LBD mutants can be generated through replacement of an amino acid residue, for
example, a particular cysteine or methionine residue, with selenocysteine or
selenomethionine. This can be achieved by growing a host organism capable of
expressing either the wild-type or mutant polypeptide on a growth medium
depleted of either natural cysteine or methionine (or both) but enriched in
selenocysteine or selenomethionine (or both).

As disclosed in the Examples presented below, mutations can be
introduced into a DNA sequence coding for an NR, SR or GR or an NR, SR or GR
LBD using synthetic oligonucleotides. These oligonucleotides contain nucleotide
sequences flanking the desired mutation sites. Mutations can be generated in the
full-length DNA sequence of an NR, SR or GR or an NR, SR or GR LBD or in any
sequence coding for polypeptide fragments of an NR, SR or GR or an NR, SR or
GR LBD.
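As a minimal sketch of the idea that a mutagenic oligonucleotide carries the altered codon together with wild-type sequence flanking the mutation site, the following function assembles such an oligonucleotide from a coding sequence. The example sequence, the flank length and the function name are hypothetical; the oligonucleotides actually used are those described in the Examples, and practical design would also consider melting temperature and strand choice.

```python
def mutagenic_oligo(coding_dna, codon_index, new_codon, flank=12):
    """Return an oligonucleotide carrying `new_codon` at the 1-based
    `codon_index`, flanked on each side by up to `flank` wild-type bases."""
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three nucleotides")
    start = (codon_index - 1) * 3
    if start < 0 or start + 3 > len(coding_dna):
        raise ValueError("codon_index lies outside the coding sequence")
    left = coding_dna[max(0, start - flank):start]
    right = coding_dna[start + 3:start + 3 + flank]
    return left + new_codon + right


# Hypothetical 30-nt coding fragment; codon 5 (TTC, Phe) becomes TCC (Ser).
fragment = "ATGGCTGGTAAATTCCTGGAAGACCGTTAA"
print(mutagenic_oligo(fragment, 5, "TCC", flank=6))
```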

According to the present invention, a mutated NR, SR or GR or NR, SR or
GR LBD DNA sequence produced by the methods described above, or any
alternative methods known in the art, can be expressed using an expression
vector. An expression vector, as is well known to those of skill in the art, typically
includes elements that permit autonomous replication in a host cell independent of
the host genome, and one or more phenotypic markers for selection purposes.
Either prior to or after insertion of the DNA sequences surrounding the desired
NR, SR or GR or NR, SR or GR LBD mutant coding sequence, an expression
vector also will include control sequences encoding a promoter, operator,
ribosome binding site, translation initiation signal, and, optionally, a repressor
gene or various activator genes and a signal for termination. In some
embodiments, where secretion of the produced mutant is desired, nucleotides
encoding a "signal sequence" can be inserted prior to an NR, SR or GR or an NR,
SR or GR LBD mutant coding sequence. For expression under the direction of
the control sequences, a desired DNA sequence must be operatively linked to the
control sequences; that is, the sequence must have an appropriate start signal in
front of the DNA sequence encoding the NR, SR or GR or NR, SR or GR LBD
mutant, and the correct reading frame to permit expression of that sequence
under the control of the control sequences and production of the desired product
encoded by that NR, SR or GR or NR, SR or GR LBD sequence must be
maintained.
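The requirements of an appropriate start signal and a maintained reading frame can be checked with a few lines of code before a construct is made. The sketch below is an illustration under stated assumptions: it treats ATG as the start signal, uses the three standard stop codons, and examines only the coding strand; the function name and test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_open_reading_frame(dna):
    """Return a list of problems that would prevent in-frame expression;
    an empty list means the sequence passes these basic checks."""
    dna = dna.upper()
    problems = []
    if not dna.startswith("ATG"):
        problems.append("no ATG start codon at the 5' end")
    if len(dna) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    for number, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon at codon {number}")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no stop codon at the 3' end")
    return problems


print(check_open_reading_frame("ATGGCTTAA"))
print(check_open_reading_frame("ATGTAAGCTTAA"))
```

A check of this kind catches frame shifts introduced during cloning but says nothing about promoter strength or ribosome binding, which are discussed separately.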
After a review of the disclosure of the present invention presented herein,
any of a wide variety of well-known available expression vectors can be useful to
express a mutated coding sequence of this invention. These include, for example,
vectors consisting of segments of chromosomal, non-chromosomal and synthetic
DNA sequences, such as various known derivatives of SV40, known bacterial
plasmids, e.g., plasmids from E. coli including col E1, pCR1, pBR322, pMB9 and
their derivatives, wider host range plasmids, e.g., RP4, phage DNAs, e.g., the
numerous derivatives of phage λ, e.g., NM 989, and other DNA phages, e.g., M13
and filamentous single stranded DNA phages, yeast plasmids and vectors derived
from combinations of plasmids and phage DNAs, such as plasmids which have
been modified to employ phage DNA or other expression control sequences. In
the preferred embodiments of this invention, vectors amenable to expression in a
pET-based expression system are employed. The pET expression system is
available from Novagen/Invitrogen, Inc., Carlsbad, California. Expression and
screening of a polypeptide of the present invention in bacteria, preferably E. coli,
is a preferred aspect of the present invention.

In addition, any of a wide variety of expression control sequences
(sequences that control the expression of a DNA sequence when operatively
linked to it) can be used in these vectors to express the mutated DNA sequences
according to this invention. Such useful expression control sequences include,
for example, the early and late promoters of SV40 for animal cells, the lac system,
the trp system, the TAC or TRC system, the major operator and promoter regions
of phage λ, the control regions of fd coat protein, all for E. coli, the promoter for
3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid
phosphatase, e.g., Pho5, the promoters of the yeast α-mating factors for yeast,
and other sequences known to control the expression of genes of prokaryotic or
eukaryotic cells or their viruses, and various combinations thereof.

A wide variety of hosts are also useful for producing mutated NR, SR or GR
and NR, SR or GR LBD polypeptides according to this invention. These hosts
include, for example, bacteria, such as E. coli, Bacillus and Streptomyces, fungi,
such as yeasts, and animal cells, such as CHO and COS-1 cells, plant cells,
insect cells, such as SF9 cells, and transgenic host cells. Expression and
screening of a polypeptide of the present invention in bacteria, preferably E. coli,
is a preferred aspect of the present invention.

It should be understood that not all expression vectors and expression
systems function in the same way to express mutated DNA sequences of this
invention, and to produce modified NR, SR or GR and NR, SR or GR LBD
polypeptides or NR, SR or GR or NR, SR or GR LBD mutants. Neither do all
hosts function equally well with the same expression system. One of skill in the
art can, however, make a selection among these vectors, expression control
sequences and hosts without undue experimentation and without departing from
the scope of this invention. For example, an important consideration in selecting a
vector will be the ability of the vector to replicate in a given host. The copy
number of the vector, the ability to control that copy number, and the expression
of any other proteins encoded by the vector, such as antibiotic markers, should
also be considered.

In selecting an expression control sequence, a variety of factors should
also be considered. These include, for example, the relative strength of the
system, its controllability and its compatibility with the DNA sequence encoding a
modified NR, SR or GR or NR, SR or GR LBD polypeptide of this invention, with
particular regard to the formation of potential secondary and tertiary structures.

Hosts should be selected by consideration of their compatibility with the
chosen vector, the toxicity of a modified polypeptide to them, their ability to
express mature products, their ability to fold proteins correctly, their fermentation
requirements, the ease of purification of a modified GR or GR LBD and safety.

Within these parameters, one of skill in the art can select various
vector/expression control system/host combinations that will produce useful
amounts of a mutant polypeptide. A mutant polypeptide produced in these
systems can be purified, for example, via the approaches disclosed in the
Examples.

Once a mutation(s) has been generated in the desired location, such as an
active site or dimerization site, the mutants can be tested for any one of several
properties of interest, i.e. "desired" or "predetermined" positions. For example,
mutants can be screened for an altered charge at physiological pH. This property
can be determined by measuring the mutant polypeptide isoelectric point (pI) and
comparing the observed value with that of the wild-type parent. Isoelectric point
can be measured by gel-electrophoresis according to the method of Wellner
(Wellner, (1971) Anal. Chem. 43: 597). A mutant polypeptide containing a
replacement amino acid located at the surface of the enzyme, as provided by the
structural information of this invention, can lead to an altered surface charge and
an altered pI.
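Before a measurement is made, the expected direction of a pI shift can be estimated from sequence alone. The following sketch is not the gel-electrophoresis method cited above; it is a rough computational estimate that assumes independent ionizable groups with approximate textbook pKa values (published sets differ slightly) and finds the pH of zero net charge by bisection. All names and the example peptides are hypothetical.

```python
# Approximate pKa values (assumed; published sets differ slightly).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5, "N_TERM": 8.6}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1, "C_TERM": 3.6}


def net_charge(sequence, ph):
    """Estimated net charge of a polypeptide at the given pH."""
    charge = 0.0
    for group in list(sequence) + ["N_TERM", "C_TERM"]:
        if group in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[group]))
        elif group in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[group] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Estimate the pI by bisection on the net charge, which decreases
    monotonically as the pH rises."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical peptides: a neutral parent and two single substitutions.
for peptide in ["AAVAAA", "AAKAAA", "AADAAA"]:
    print(peptide, round(isoelectric_point(peptide), 2))
```

Such an estimate ignores the influence of the folded environment on individual pKa values, so it indicates the direction of a shift more reliably than its magnitude.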

Generation of Engineered NR, SR or GR or NR, SR or GR LBD
Mutants

In another aspect of the present invention, a unique NR, SR or GR or NR, 
SR or GR LBD polypeptide is generated. Such a mutant can facilitate purification 
and the study of the structure and the ligand-binding abilities of a NR, SR or GR 
polypeptide. Thus, an aspect of the present invention involves the use of both 
targeted and random mutagenesis of the GR gene for the production of a 
recombinant protein with improved solution characteristics for the purpose of 
crystallization, characterization of biologically relevant protein-protein interactions, 
and compound screening assays, or for the production of a recombinant
polypeptide having other characteristics of interest. Expression of the polypeptide 
in bacteria, preferably E. coli, is also an aspect of the present invention. 

In one embodiment, targeted mutagenesis was performed using a 
sequence alignment of several nuclear receptors, primarily steroid receptors. 
Several residues that were hydrophobic in GR and hydrophilic in other receptors 
were chosen for mutagenesis. Most of these residues were predicted to be 
solvent exposed hydrophobic residues in GR. Therefore, mutations were made to 
change these hydrophobic residues to hydrophilic in an attempt to improve the
solubility and stability of E. coli-expressed GR LBD. Table 2 immediately below
presents a list of mutations that were made and tested for expression in E. coli.

Table 2
Mutations of the GR LBD (521-777) Gene for
Testing Solution Solubility and Stability

Single mutations    Double mutations    Triple mutations
V552K               L535T/V538S         M691T/V702T/W712T
W557S               V552K/W557S
F602S               L636E/C638S
F602D
F602E
L636E
Y648Q
W712S
L741R
F602Y
F602T
F602N
F602C

Random mutagenesis can be performed on residues where a significant
difference, hydrophobic versus hydrophilic, is observed between GR and other
steroid receptors based on sequence alignment. Such positions can be
randomized by oligo-directed or cassette mutagenesis. A GR LBD protein library
can be sorted by an appropriate display system to select mutants with improved
solution properties. Residues in GR that meet the criteria for such an approach
include: V538, V552, W557, F602, L636, Y648, Y660, L685, M691, V702, W712,
L733, and Y764. In addition, residues predicted to neighbor these positions could
also be randomized.

In another embodiment, complete random mutagenesis can be performed
on any residue within the context of the GR LBD. A method such as
error-incorporating PCR or chemical-based mutagenesis can be used to introduce
mutations in an unbiased manner. These methods randomize the position of
mutation as well as the nature of the mutated residue. A completely random GR
LBD library can be screened for improved expression with the appropriate
expression or display system. Ideally, the selection method should identify mutant
proteins with increased expression, solubility, stability, and/or activity. A
technique well suited for this purpose is the "peptides-on-plasmids" display system
that utilizes the DNA-binding activity of the lac repressor (LacI). GR, or another
nuclear receptor LBD, can be expressed as a fusion to either LacI or a fragment of
LacI, such as the "headpiece dimer", that comprises the DNA-binding domain.
Because the plasmid that expresses the fusion protein also comprises a lac
operon binding site, the protein will be physically coupled to the plasmid. GR
mutants that produce soluble protein can then be isolated using either the
coactivator peptide- or ligand-binding activity of the receptor. Table 2A below
shows mutations that were prepared using the LacI-based "peptides-on-plasmids"
technique with GR LBD.



Table 2A
Random Mutations of the GR LBD (521-777) Gene for Improving
Solution Solubility and Stability

Single mutations    SEQ ID NO    Double mutations    SEQ ID NO
W557R               33           F602L/A580T         38
Q615L               34           L563F/G583C         39
Q615H               35           L664H/M752T         40
A574T               36           L563F/T744N         41
L620M               37

A method of modifying a test NR polypeptide is thus disclosed. The
method can comprise: providing a test NR polypeptide sequence having a
characteristic that is targeted for modification; aligning the test NR polypeptide
sequence with at least one reference NR polypeptide sequence for which an X-ray
structure is available, wherein the at least one reference NR polypeptide
sequence has a characteristic that is desired for the test NR polypeptide; building
a three-dimensional model for the test NR polypeptide using the
three-dimensional coordinates of the X-ray structure(s) of the at least one reference
polypeptide and its sequence alignment with the test NR polypeptide sequence;
examining the three-dimensional model of the test NR polypeptide for differences
with the at least one reference polypeptide that are associated with the desired
characteristic; and mutating at least one amino acid residue in the test NR
polypeptide sequence located at a difference identified above to a residue
associated with the desired characteristic, whereby the test NR polypeptide is
modified. By the term "associated with a desired characteristic" it is meant that a
residue is found in the reference polypeptide at a point of difference wherein the
difference provides a desired characteristic or phenotype in the reference
polypeptide.

A method of altering the solubility of a test NR polypeptide is also disclosed
in accordance with the present invention. In a preferred embodiment, the method
comprises: (a) providing a reference NR polypeptide sequence and a test NR
polypeptide sequence; (b) comparing the reference NR polypeptide sequence and
the test NR polypeptide sequence to identify one or more residues in the test NR
sequence that are more or less hydrophilic than a corresponding residue in the
reference NR polypeptide sequence; and (c) mutating the residue in the test NR
polypeptide sequence identified in step (b) to a residue having a different
hydrophilicity, whereby the solubility of the test NR polypeptide is altered.
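The comparison of step (b) can be sketched for a pair of pre-aligned sequences as follows. This is only an illustration under stated assumptions: it uses the Kyte-Doolittle hydropathy scale as a stand-in for the hydrophobicity criteria of this disclosure, it reports only positions where the test residue is more hydrophobic than the reference residue, and the 3.0-unit threshold, the function name and the example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_in_test_only(test_aligned, reference_aligned, min_difference=3.0):
    """Return (alignment column, test residue, reference residue) for each
    ungapped column where the test residue is more hydrophobic than the
    reference residue by at least `min_difference`. Columns are 1-based
    alignment columns, not residue numbers."""
    if len(test_aligned) != len(reference_aligned):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    pairs = zip(test_aligned, reference_aligned)
    for column, (test_res, ref_res) in enumerate(pairs, start=1):
        if test_res == "-" or ref_res == "-":
            continue  # skip gap columns
        if KYTE_DOOLITTLE[test_res] - KYTE_DOOLITTLE[ref_res] >= min_difference:
            hits.append((column, test_res, ref_res))
    return hits


# Hypothetical aligned fragments (gaps shown as '-').
test_seq = "MFLS-KV"
ref_seq = "MSLSAKE"
print(hydrophobic_in_test_only(test_seq, ref_seq))
```

Each reported position is a candidate for the substitution of step (c), for example replacing the test residue with the residue found in the reference sequence.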
By the term "altering" it is meant any change in the solubility of the test NR
polypeptide, including preferably a change to make the polypeptide more soluble.
Such approaches to obtain soluble proteins for crystallization studies have been
successfully demonstrated in the case of HIV integrase and the
human leptin cytokine. See Dyda, F., et al., Science (1994) Dec 23;
266(5193):1981-6; and Zhang et al., Nature (1997) May 8; 387(6629):206-9.

Typically, such a change involves substituting a residue that is more
hydrophilic than the wild type residue. Hydrophobicity and hydrophilicity criteria
and comparison information are set forth herein below. Optionally, the reference
NR polypeptide sequence is an AR or a PR sequence, and the test polypeptide
sequence is a GR polypeptide sequence. Alternatively, the reference polypeptide
sequence is a crystalline GR LBD. The comparing of step (b) is preferably by
sequence alignment. More preferably, the screening is carried out in bacteria,
even more preferably, in E. coli.

A method for modifying a test NR polypeptide to alter and preferably
improve the solubility, stability in solution and other solution behavior, to alter and
preferably improve the folding and stability of the folded structure, and to alter and
preferably improve the ability to form ordered crystals is also provided in
accordance with the present invention. The aforementioned characteristics are
representative "desired" or "predetermined" characteristics or phenotypes.

In a preferred embodiment, the method comprises: 

(a) providing a test NR polypeptide sequence for which the solubility,
stability in solution, other solution behavior, tendency to fold properly, ability to
form ordered crystals, or combination thereof is different from that desired;

(b) aligning the test NR polypeptide sequence with the sequences of other 
reference NR polypeptides for which the X-ray structure is available and for which 
the solution properties, folding behavior and crystallization properties are closer to 
those desired; 

(c) building a three-dimensional model for the test NR polypeptide using the
three-dimensional coordinates of the X-ray structure(s) of one or more of the
reference polypeptides and their sequence alignment with the test NR polypeptide
sequence;

(d) optionally, optimizing the side-chain conformations in the
three-dimensional model by generating many alternative side-chain conformations,
refining by energy minimization, and selecting side-chain conformations with lower
energy;

(e) examining the three-dimensional model for the test NR graphically for 
lipophilic side-chains that are exposed to solvent, for clusters of two or more 
lipophilic side-chains exposed to solvent, for lipophilic pockets and clefts on the 
surface of the protein model, and in particular for sites on the surface of the 
protein model that are more lipophilic than the corresponding sites on the 
structure(s) of the reference NR polypeptide(s); 

(f) for each residue identified in step (e), mutating the amino acid to an 
amino acid with different hydrophilicity, and usually to a more hydrophilic amino 
acid, whereby the exposed lipophilic sites are reduced, and the solution properties 
improved; 

(g) examining the three-dimensional model graphically at each site where 
the amino acid in the test NR polypeptide is different from the amino acid at the 
corresponding position in the reference NR polypeptide, and checking whether the 
amino acid in the test NR polypeptide makes favorable interactions with the atoms 
that lie around it in the three-dimensional model, considering the side-chain 
conformations predicted in steps (c) and, optionally step (d), as well as likely 
alternative conformations of the side-chains, and also considering the possible 
presence of water molecules (for this analysis, an amino acid is considered to 
make "favorable interactions with the atoms that lie around it" if these interactions 
are more favorable than the interactions that would be obtained if it was replaced 
by any of the 19 other naturally-occurring amino acids); 

(h) for each residue identified in step (g) as not making favorable 
interactions with the atoms that lie around it, mutating the residue to another 
amino acid that could make better interactions with the atoms that lie around it, 
thereby promoting the tendency for the test NR polypeptide to fold into a stable 
structure with improved solution properties, less tendency to unfold, and greater 
tendency to form ordered crystals; 

(i) examining the three-dimensional model graphically at each residue 
position where the amino acid in the test NR polypeptide is different from the 
amino acid at the corresponding position in the reference NR polypeptide, and 
checking whether the steric packing, hydrogen bonding and other energetic 
interactions could be improved by mutating that residue or any one or more of the 
surrounding residues lying within 8 angstroms in the three-dimensional model; 

(j) for each residue position identified in step (i) as potentially allowing an 
improvement in the packing, hydrogen bonding and energetic interactions, 
mutating those residues individually or in combination to residues that could 
improve the packing, hydrogen bonding and energetic interactions, thereby 
promoting the tendency for the test NR polypeptide to fold into a stable structure 
with improved solution properties, less tendency to unfold, and greater tendency 
to form ordered crystals. 
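
The 8 angstrom neighborhood of step (i) can be sketched as a simple distance filter. This is only an illustration under stated assumptions: each residue is represented by a single atom position (for example a C-alpha atom), which the text does not specify, and the function name `residues_within` and the coordinates in `toy_coords` are hypothetical rather than taken from any structure.

```python
import math

def residues_within(coords, center, cutoff=8.0):
    """Return the residue numbers whose representative atom lies within
    `cutoff` angstroms of the representative atom of residue `center`.

    `coords` maps residue number -> (x, y, z) in angstroms. Using one atom
    per residue is an assumption made for this sketch.
    """
    origin = coords[center]
    near = []
    for res_id, position in coords.items():
        if res_id == center:
            continue  # a residue is not its own neighbor
        if math.dist(origin, position) <= cutoff:
            near.append(res_id)
    return sorted(near)

# Hypothetical coordinates, invented for illustration only.
toy_coords = {
    600: (0.0, 0.0, 0.0),
    601: (3.8, 0.0, 0.0),
    602: (7.0, 2.0, 0.0),
    640: (12.0, 0.0, 0.0),
}

neighbors_of_602 = residues_within(toy_coords, 602)
```

In practice the candidate list returned for each differing position would then be inspected graphically, as the steps above describe; the filter only narrows down where to look.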

By the term "graphically" it is meant through the use of computer aided 
graphics, such as by the use of a software package disclosed herein above. 
Optionally, in this embodiment, the reference NR polypeptide is AR, or preferably 
PR, when the test NR polypeptide is GRα. Alternatively, the reference NR 
polypeptide is GRα, and the test NR polypeptide is GRβ or MR. 

An isolated GR polypeptide comprising a mutation in a ligand binding 
domain, wherein the mutation alters the solubility of the ligand binding domain is 
also disclosed. An isolated GR polypeptide, or functional portion thereof, having 
one or more mutations comprising a substitution of a hydrophobic amino acid 
residue by a hydrophilic amino acid residue in a ligand binding domain is also 
disclosed. Preferably, in each case, the mutation can be at a residue selected 
from the group consisting of V552, W557, F602, L636, Y648, W712, L741, L535, 
V538, C638, M691, V702, Y648, Y660, L685, M691, V702, W712, L733, Y764 
and combinations thereof. More preferably, the mutation is selected from the 
group consisting of V552K, W557S, F602S, F602D, F602E, F602Y, F602T, 
F602N, F602C, L636E, Y648Q, W712S, L741R, L535T, V538S, C638S, M691T, 
V702T, W712T and combinations thereof. Even more preferably, the mutation is 
made by targeted point or randomizing mutagenesis. Hydrophobicity and 
hydrophilicity criteria and comparison information are set forth herein below. 
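
The mutation labels above follow the usual convention of wild-type residue, position, and replacement residue in one-letter code. A minimal sketch of reading such a label and applying it to a sequence is given below; the helper names and the three-residue fragment used in the usage line are hypothetical, and the fragment is not taken from any SEQ ID NO.

```python
import re

def parse_mutation(label):
    """Split a label such as 'F602S' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)

def apply_mutation(sequence, label, first_residue_number=1):
    """Return `sequence` with the point mutation applied.

    `first_residue_number` is the residue number of sequence[0], so that a
    fragment (for example a ligand binding domain) can keep full-length
    numbering. The wild-type residue is checked before substitution.
    """
    wild_type, position, replacement = parse_mutation(label)
    index = position - first_residue_number
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} lies outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]

# Hypothetical fragment whose first residue is numbered 601.
mutated = apply_mutation("AFG", "F602S", first_residue_number=601)
```

Checking the wild-type residue before substituting guards against numbering offsets, which matter here because the residue numbers in the text pertain to one isoform.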

As discussed above, the GRα gene can be translated from its mRNA by 
alternative initiation from an internal ATG codon (Yudt & Cidlowski (2001) Molec 
Endocrinol. 15: 1093-1103). This codon codes for methionine at position 27 and 
translation from this position produces a slightly smaller protein. These two 
isoforms, translated from the same gene, are referred to as GR-A and GR-B. It 
has been shown in a cellular system that the shorter GR-B form is more effective 
in initiating transcription from a GRE compared to GR-A. Additionally, another 
form of GR, called GRβ, is produced by an alternative splicing event. The GRβ 
protein differs from GRα at the very C-terminus, where the final 50 amino acids 
are replaced with a 15 amino acid segment. These two isoforms are 100% 
identical up to amino acid 727. No sequence similarity exists between GRα and 
GRβ at the C-terminus beyond position 727. GRβ has been shown to be a 
dominant negative regulator of GRα-mediated gene transcription (Oakley, Sar & 
Cidlowski (1996) J. Biol. Chem. 271: 9550-9559). It has been suggested that 
some of the tissue specific effects observed with glucocorticoid treatment may in 
part be due to the presence of varying amounts of isoform in certain cell-types. 
This method is also applicable to any other subfamily so organized. Thus, while 
the amino acid residue numbers referenced above pertain to GR-A, the 
polypeptides of the present invention also have a mutation at an analogous 
position in any polypeptide based on a sequence alignment (such as prepared by 
BLAST or other approach disclosed herein or known in the art) to GRα, which are 
not set forth herein for convenience. 

As used in the following discussion, the terms "engineered NR, SR or GR", 
"engineered NR, SR or GR LBD", "NR, SR or GR mutant", and "NR, SR or GR 
LBD mutant" refer to polypeptides having amino acid sequences that contain at 
least one mutation in the wild-type sequence, including at an analogous position 
in any polypeptide based on a sequence alignment to GRα. The terms also refer 
to NR, SR or GR and NR, SR or GR LBD polypeptides which are capable of 
exerting a biological effect in that they comprise all or a part of the amino acid 
sequence of an engineered mutant polypeptide of the present invention, or 
cross-react with antibodies raised against an engineered mutant polypeptide, or retain 
all or some or an enhanced degree of the biological activity of the engineered 
mutant amino acid sequence or protein. Such biological activity can include the 
binding of small molecules in general, the binding of glucocorticoids in particular 
and even more particularly the binding of dexamethasone. 

The terms "engineered NR, SR or GR LBD" and "NR, SR or GR LBD 
mutant" also include analogs of an engineered NR, SR or GR polypeptide or NR, 
SR or GR LBD or GR LBD mutant polypeptide. By "analog" is intended that a 
DNA or polypeptide sequence can contain alterations relative to the sequences 
disclosed herein, yet retain all or some or an enhanced degree of the biological 
activity of those sequences. Analogs can be derived from genomic nucleotide 
sequences or from other organisms, or can be created synthetically. Those of 
skill in the art will appreciate that other analogs, as yet undisclosed or 
undiscovered, can be used to design and/or construct mutant analogs. There is 
no need for an engineered mutant polypeptide to comprise all or substantially all 
of the amino acid sequence of the wild type polypeptide (e.g. SEQ ID NOs:2 or 
10). Shorter or longer sequences are anticipated to be of use in the invention; 
shorter sequences are herein referred to as "segments". Thus, the terms 
"engineered NR, SR or GR LBD" and "NR, SR or GR LBD mutant" also include 
fusion, chimeric or recombinant engineered NR, SR or GR LBD or NR, SR or GR 
LBD mutant polypeptides and proteins comprising sequences of the present 
invention. Methods of preparing such proteins are disclosed herein above. 

X.D. Sequence Similarity and Identity 

As used herein, the term "substantially similar" as applied to GR means 
that a particular sequence varies from the nucleic acid sequence of any of odd 
numbered SEQ ID NOs:1-15, or the amino acid sequence of any of even 
numbered SEQ ID NOs:2-16 by one or more deletions, substitutions, or additions, 
the net effect of which is to retain at least some of the biological activity of the 
natural gene, gene product, or sequence. Such sequences include "mutant" or 
"polymorphic" sequences, or sequences in which the biological activity and/or the 
physical properties are altered to some degree but retain at least some or an 
enhanced degree of the original biological activity and/or physical properties. In 
determining nucleic acid sequences, all subject nucleic acid sequences capable of 
encoding substantially similar amino acid sequences are considered to be 
substantially similar to a reference nucleic acid sequence, regardless of 
differences in codon sequences or substitution of equivalent amino acids to create 
biologically functional equivalents. 

X.D.1. Sequences That are Substantially Identical to an Engineered 
NR, SR or GR or NR, SR or GR LBD Mutant Sequence of the 
Present Invention 

Nucleic acids that are substantially identical to a nucleic acid sequence of 
an engineered NR, SR or GR or NR, SR or GR LBD mutant of the present 
invention, e.g. allelic variants, genetically altered versions of the gene, etc., bind to 
an engineered NR, SR or GR or NR, SR or GR LBD mutant sequence under 
stringent hybridization conditions. By using probes, particularly labeled probes of 
DNA sequences, one can isolate homologous or related genes. The source of 
homologous genes can be any species, e.g. primate species; rodents, such as 
rats and mice; canines, felines, bovines, equines, yeast, nematodes, etc. 

Between mammalian species, e.g. human and mouse, homologs have 
substantial sequence similarity, i.e. at least 75% sequence identity between 
nucleotide sequences. Sequence similarity is calculated based on a reference 
sequence, which can be a subset of a larger sequence, such as a conserved 
motif, coding region, flanking region, etc. A reference sequence will usually be at 
least about 18 nt long, more usually at least about 30 nt long, and can extend to 
the complete sequence that is being compared. Algorithms for sequence analysis 
are known in the art, such as BLAST, described in Altschul et al. (1990) J. Mol. 
Biol. 215: 403-10. Software for performing BLAST analyses is publicly available 
through the National Center for Biotechnology Information 
(http://www.ncbi.nlm.nih.gov/). 

This algorithm involves first identifying high scoring sequence pairs (HSPs) 
by identifying short words of length W in the query sequence, which either match 
or satisfy some positive-valued threshold score T when aligned with a word of the 
same length in a database sequence. T is referred to as the neighborhood word 
score threshold. These initial neighborhood word hits act as seeds for initiating 
searches to find longer HSPs containing them. The word hits are then extended in 
both directions along each sequence for as far as the cumulative alignment score 
can be increased. Cumulative scores are calculated using, for nucleotide 
sequences, the parameters M (reward score for a pair of matching residues; 
always > 0) and N (penalty score for mismatching residues; always < 0). For 
amino acid sequences, a scoring matrix is used to calculate the cumulative score. 
Extension of the word hits in each direction is halted when the cumulative 
alignment score falls off by the quantity X from its maximum achieved value, the 
cumulative score goes to zero or below due to the accumulation of one or more 
negative-scoring residue alignments, or the end of either sequence is reached. 
The BLAST algorithm parameters W, T, and X determine the sensitivity and speed 
of the alignment. The BLASTN program (for nucleotide sequences) uses as 
defaults a wordlength W=11, an expectation E=10, a cutoff of 100, M=5, N=-4, 
and a comparison of both strands. For amino acid sequences, the BLASTP 
program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the 
BLOSUM62 scoring matrix. See Henikoff & Henikoff (1992) Proc Natl Acad Sci 
U.S.A. 89: 10915. 
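
The extension rule described above can be sketched for the nucleotide case as follows. This is a simplified, ungapped, one-directional illustration and not the BLAST implementation itself: the function name `extend_right` is hypothetical, the default drop-off `x_drop=20` is an assumed value (the text gives no default for X), and the usage line uses a short seed of 4 rather than W=11 to keep the example small. The reward and penalty defaults are the M=5 and N=-4 quoted in the text.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed word to the right without gaps.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed. Extension stops when the running
    score has fallen x_drop below its maximum, when the running score is
    zero or below, or when either sequence ends.
    """
    def pair_score(offset):
        same = query[q_start + offset] == subject[s_start + offset]
        return match if same else mismatch

    # Score the seed word itself.
    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_length = score, word_len

    offset = word_len
    while q_start + offset < len(query) and s_start + offset < len(subject):
        score += pair_score(offset)
        if score > best_score:
            best_score, best_length = score, offset + 1
        elif best_score - score >= x_drop or score <= 0:
            break
        offset += 1
    return best_score, best_length

# Seed of length 4 at the start of both toy sequences.
hsp = extend_right("ACGTACGTTT", "ACGTACGAAA", 0, 0, word_len=4)
```

A full search would also extend to the left and, for amino acid sequences, replace the match and mismatch constants with a scoring-matrix lookup.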

In addition to calculating percent sequence identity, the BLAST algorithm 
also performs a statistical analysis of the similarity between two sequences. See, 
e.g., Karlin and Altschul (1993) Proc Natl Acad Sci U.S.A. 90: 5873-5877. One 
measure of similarity provided by the BLAST algorithm is the smallest sum 
probability (P(N)), which provides an indication of the probability by which a match 
between two nucleotide or amino acid sequences would occur by chance. For 
example, a test nucleic acid sequence is considered similar to a reference 
sequence if the smallest sum probability in a comparison of the test nucleic acid 
sequence to the reference nucleic acid sequence is less than about 0.1, more 
preferably less than about 0.01, and most preferably less than about 0.001. 

Percent identity or percent similarity of a DNA or peptide sequence can be 
determined, for example, by comparing sequence information using the GAP 
computer program, available from the University of Wisconsin Genetics 
Computer Group. The GAP program utilizes the alignment method of Needleman 
et al. (1970) J. Mol. Biol. 48: 443, as revised by Smith et al. (1981) Adv. Appl. 
Math. 2: 482. Briefly, the GAP program defines similarity as the number of aligned 
symbols (i.e., nucleotides or amino acids) which are similar, divided by the total 
number of symbols in the shorter of the two sequences. The preferred 
parameters for the GAP program are the default parameters, which do not impose 
a penalty for end gaps. See, e.g., Schwartz et al., eds., (1979) Atlas of Protein 
Sequence and Structure, National Biomedical Research Foundation, pp. 357-358, 
and Gribskov et al. (1986) Nucl. Acids. Res. 14: 6745. 

The term "similarity" is contrasted with the term "identity". Similarity is 
defined as above; "identity", however, means a nucleic acid or amino acid 
sequence having the same amino acid at the same relative position in a given 
family member of a gene family. Homology and similarity are generally viewed as 
broader terms than the term identity. Biochemically similar amino acids, for 
example leucine/isoleucine or glutamate/aspartate, can be present at the same 
position; these are not identical per se, but are biochemically "similar." As 
disclosed herein, these are referred to as conservative differences or conservative 
substitutions. This differs from a conservative mutation at the DNA level, which 
changes the nucleotide sequence without making a change in the encoded amino 
acid, e.g. TCC to TCA, both of which encode serine. 
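
The distinction between identity and similarity can be made concrete with a small calculation on two pre-aligned sequences. The sketch below is not the GAP program; it only applies the stated definition (matching or similar aligned symbols divided by the length of the shorter sequence). The set `CONSERVATIVE_GROUPS` is an assumption limited to the two example pairs named above, and the function names and toy peptides are hypothetical.

```python
# Only the two example pairs from the text; a real scheme would use a
# fuller grouping or a scoring matrix (assumption for this sketch).
CONSERVATIVE_GROUPS = [set("LI"), set("ED")]

def is_similar(a, b):
    """True if two residues are identical or fall in the same group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)

def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (identity, similarity) as fractions of the shorter sequence.

    Both inputs must already be aligned to equal length, with `gap`
    marking inserted positions. Columns containing a gap are not counted
    as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    identical = sum(a == b for a, b in pairs)
    similar = sum(is_similar(a, b) for a, b in pairs)
    shorter = min(len(aligned_a.replace(gap, "")),
                  len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return identical / shorter, similar / shorter

# Toy peptides: L/I and E/D differ but are conservative substitutions.
identity, similarity = identity_and_similarity("MLEK", "MIDK")
```

For the toy pair, half of the positions are identical while all of them are similar, which is the sense in which similarity is the broader term.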

As used herein, DNA analog sequences are "substantially identical" to 
specific DNA sequences disclosed herein if: (a) the DNA analog sequence is 
derived from coding regions of the nucleic acid sequence shown in any one of odd 
numbered SEQ ID NOs:1-15; or (b) the DNA analog sequence is capable of 
hybridization with DNA sequences of (a) under stringent conditions and 
encodes a biologically active GRα or GRα LBD gene product; or (c) the DNA 
sequences are degenerate as a result of alternative genetic code to the DNA 
analog sequences defined in (a) and/or (b). Substantially identical analog proteins 
and nucleic acids will have between about 70% and 80%, preferably between 
about 81% to about 90% or even more preferably between about 91% and 99% 
sequence identity with the corresponding sequence of the native protein or nucleic 
acid. Sequences having lesser degrees of identity but comparable biological 
activity are considered to be equivalents. 

As used herein, "stringent conditions" means conditions of high stringency, 
for example 6X SSC, 0.2% polyvinylpyrrolidone, 0.2% Ficoll, 0.2% bovine serum 
albumin, 0.1% sodium dodecyl sulfate, 100 ng/ml salmon sperm DNA and 15% 
formamide at 68°C. For the purposes of specifying additional conditions of high 
stringency, preferred conditions are salt concentration of about 200 mM and 
temperature of about 45°C. One example of such stringent conditions is 
hybridization at 4X SSC, at 65°C, followed by a washing in 0.1X SSC at 65°C for 
one hour. Another exemplary stringent hybridization scheme uses 50% 
formamide, 4X SSC at 42°C. 

In contrast, nucleic acids having sequence similarity are detected by 
hybridization under lower stringency conditions. Thus, sequence identity can be 
determined by hybridization under lower stringency conditions, for example, at 
50°C or higher and 0.1X SSC (9 mM NaCl/0.9 mM sodium citrate) and the 
sequences will remain bound when subjected to washing at 55°C in 1X SSC. 
As used herein, the term "complementary sequences" means nucleic acid 
sequences that are base-paired according to the standard Watson-Crick 
complementarity rules. The present invention also encompasses the use of 
nucleotide segments that are complementary to the sequences of the present 
invention. 

Hybridization can also be used for assessing complementary sequences 
and/or isolating complementary nucleotide sequences. As discussed above, 
nucleic acid hybridization will be affected by such conditions as salt concentration, 
temperature, or organic solvents, in addition to the base composition, length of the 
complementary strands, and the number of nucleotide base mismatches between 
the hybridizing nucleic acids, as will be readily appreciated by those skilled in the 
art. Stringent temperature conditions will generally include temperatures in 
excess of about 30°C, typically in excess of about 37°C, and preferably in excess 
of about 45°C. Stringent salt conditions will ordinarily be less than about 1,000 
mM, typically less than about 500 mM, and preferably less than about 200 mM. 
However, the combination of parameters is much more important than the 
measure of any single parameter. See, e.g., Wetmur & Davidson (1968) J. Mol. 
Biol. 31: 349-70. Determining appropriate hybridization conditions to identify 
and/or isolate sequences containing high levels of homology is well known in the 
art. See, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 
Cold Spring Harbor, New York. 

X.D.2. Functional Equivalents of an Engineered NR, SR or GR or 
NR, SR, GR LBD Mutant Nucleic Acid Sequence of the 
Present Invention 

As used herein, the term "functionally equivalent codon" is used to refer to 
codons that encode the same amino acid, such as the AGC and AGU codons for 
serine. For example, GRα or GRα LBD-encoding nucleic acid sequences 
comprising any one of odd numbered SEQ ID NOs:1-15, which have functionally 
equivalent codons are covered by the present invention. Thus, when referring to 
the sequence example presented in odd numbered SEQ ID NOs:1-15, applicants 
provide substitution of functionally equivalent codons into the sequence example 
of odd numbered SEQ ID NOs:1-15. Thus, applicants are in possession of 
amino acid and nucleic acids sequences which include such substitutions but 
which are not set forth herein in their entirety for convenience. 
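
Functionally equivalent codons can be enumerated directly from the standard genetic code. The sketch below builds the standard codon table in the DNA alphabet (RNA codons are accepted by converting U to T) and lists the other codons that encode the same amino acid. The names `CODON_TABLE`, `synonymous_codons` and `is_silent` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def normalize(codon):
    """Upper-case a codon and convert RNA 'U' to DNA 'T'."""
    return codon.upper().replace("U", "T")

def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid."""
    codon = normalize(codon)
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items()
                  if a == amino_acid and c != codon)

def is_silent(codon_a, codon_b):
    """True if swapping one codon for the other leaves the residue unchanged."""
    return CODON_TABLE[normalize(codon_a)] == CODON_TABLE[normalize(codon_b)]

serine_alternatives = synonymous_codons("TCC")
```

Serine is one of the amino acids with six codons, so any serine codon has five functionally equivalent alternatives.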

It will also be understood by those of skill in the art that amino acid and 
nucleic acid sequences can include additional residues, such as additional N- or 
C-terminal amino acids or 5' or 3' nucleic acid sequences, and yet still be 
essentially as set forth in one of the sequences disclosed herein, so long as the 
sequence retains biological protein activity where polypeptide expression is 
concerned. The addition of terminal sequences particularly applies to nucleic acid 
sequences which can, for example, include various non-coding sequences 
flanking either of the 5' or 3' portions of the coding region or can include various 
internal sequences, i.e., introns, which are known to occur within genes. 

X.D.3. Biological Equivalents 

The present invention envisions and includes biological equivalents of an 
engineered NR, SR or GR or NR, SR or GR LBD mutant polypeptide of the 
present invention. The term "biological equivalent" refers to proteins having amino 
acid sequences which are substantially identical to the amino acid sequence of an 
engineered NR, SR or GR LBD mutant of the present invention and which are 
capable of exerting a biological effect in that they are capable of binding small 
molecules or cross-reacting with anti-NR, SR or GR or NR, SR or GR LBD 
mutant antibodies raised against an engineered mutant NR, SR or GR or NR, SR 
or GR LBD polypeptide of the present invention. 

For example, certain amino acids can be substituted for other amino acids 
in a protein structure without appreciable loss of interactive capacity with, for 
example, structures in the nucleus of a cell. Since it is the interactive capacity and 
nature of a protein that defines that protein's biological functional activity, certain 
amino acid sequence substitutions can be made in a protein sequence (or the 
nucleic acid sequence encoding it) to obtain a protein with the same, enhanced, or 
antagonistic properties. Such properties can be achieved by interaction with the 
normal targets of the protein, but this need not be the case, and the biological 
activity of the invention is not limited to a particular mechanism of action. It is thus 
in accordance with the present invention that various changes can be made in the 
amino acid sequence of an engineered NR, SR or GR or NR, SR or GR LBD 
mutant polypeptide of the present invention or its underlying nucleic acid 
sequence without appreciable loss of biological utility or activity. 

Biologically equivalent polypeptides, as used herein, are polypeptides in 
which certain, but not most or all, of the amino acids can be substituted. Thus, 
when referring to the sequence examples presented in any of even numbered 
SEQ ID NOs:2-16, applicants envision substitution of codons that encode 
biologically equivalent amino acids, as described herein, into a sequence example 
of even numbered SEQ ID NOs:2-16, respectively. Thus, applicants are in 
possession of amino acid and nucleic acids sequences which include such 
substitutions but which are not set forth herein in their entirety for convenience. 

Alternatively, functionally equivalent proteins or peptides can be created via 
the application of recombinant DNA technology, in which changes in the protein 
structure can be engineered, based on considerations of the properties of the 
amino acids being exchanged, e.g. substitution of Ile for Leu. Changes designed 
by man can be introduced through the application of site-directed mutagenesis 
techniques, e.g., to introduce improvements to the antigenicity of the protein or to 
test an engineered mutant polypeptide of the present invention in order to 
modulate lipid-binding or other activity, at the molecular level. 

Amino acid substitutions, such as those which might be employed in 
modifying an engineered mutant polypeptide of the present invention are 
generally, but not necessarily, based on the relative similarity of the amino acid 
side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, 
size, and the like. An analysis of the size, shape and type of the amino acid 
side-chain substituents reveals that arginine, lysine and histidine are all positively 
charged residues; that alanine, glycine and serine are all of similar size; and that 
phenylalanine, tryptophan and tyrosine all have a generally similar shape. 
Therefore, based upon these considerations, arginine, lysine and histidine; 
alanine, glycine and serine; and phenylalanine, tryptophan and tyrosine; are 
defined herein as biologically functional equivalents. Those of skill in the art will 
appreciate other biologically functionally equivalent changes. It is implicit in the 
above discussion, however, that one of skill in the art can appreciate that a 
radical, rather than a conservative substitution is warranted in a given situation. 
Non-conservative substitutions in engineered mutant LBD polypeptides of the 
present invention are also an aspect of the present invention. 

In making biologically functional equivalent amino acid substitutions, the 
hydropathic index of amino acids can be considered. Each amino acid has been 
assigned a hydropathic index on the basis of its hydrophobicity and charge 
characteristics; these are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); 
phenylalanine (+2.8); cysteine (+2.5); methionine (+1.9); alanine (+1.8); glycine 
(-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline 
(-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); 
asparagine (-3.5); lysine (-3.9); and arginine (-4.5). 

The importance of the hydropathic amino acid index in conferring 
interactive biological function on a protein is generally understood in the art (Kyte 
& Doolittle (1982) J. Mol. Biol. 157: 105-132, incorporated herein by reference). 
It is known that certain amino acids can be substituted for other amino acids 
having a similar hydropathic index or score and still retain a similar biological 
activity. In making changes based upon the hydropathic index, the substitution of 
amino acids whose hydropathic indices are within ±2 of the original value is 
preferred, those which are within ±1 of the original value are particularly preferred, 
and those within ±0.5 of the original value are even more particularly preferred. 
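
The ±2, ±1 and ±0.5 windows can be applied mechanically to the hydropathic index values listed above. The following sketch does so; the function names are hypothetical, and the table simply transcribes the values given in the text using one-letter residue codes.

```python
# Hydropathic index values as listed in the text (one-letter residue codes).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}

def substitutions_within(residue, tolerance=2.0):
    """List residues whose hydropathic index lies within `tolerance`
    of the index of `residue` (the residue itself is excluded)."""
    base = HYDROPATHY[residue]
    return sorted(
        aa for aa, value in HYDROPATHY.items()
        if aa != residue and abs(value - base) <= tolerance
    )

def hydropathy_shift(original, replacement):
    """Change in hydropathic index caused by a substitution;
    negative values mean the replacement is more hydrophilic."""
    return HYDROPATHY[replacement] - HYDROPATHY[original]

conservative_for_arginine = substitutions_within("R")      # within +/-2
strict_for_isoleucine = substitutions_within("I", 0.5)     # within +/-0.5
phe_to_ser = hydropathy_shift("F", "S")
```

By this measure a phenylalanine to serine change, such as F602S, shifts the index by well over 2 units toward the hydrophilic end, so it falls outside the conservative window and is the kind of deliberate, non-conservative change discussed earlier for altering solubility.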

It is also understood in the art that the substitution of like amino acids can 
be made effectively on the basis of hydrophilicity. U.S. Patent No. 4,554,101, 
incorporated herein by reference, states that the greatest local average 
hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino 
acids, correlates with its immunogenicity and antigenicity, i.e. with a biological 
property of the protein. It is understood that an amino acid can be substituted for 
another having a similar hydrophilicity value and still obtain a biologically 
equivalent protein. 

As detailed in U.S. Patent No. 4,554,101, the following hydrophilicity values 
have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); 
aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); 
glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5±1); alanine (-0.5); 
histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); 
isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4). 

In making changes based upon similar hydrophilicity values, the 
substitution of amino acids whose hydrophilicity values are within ±2 of the original 
value is preferred, those which are within ±1 of the original value are particularly 
preferred, and those within ±0.5 of the original value are even more particularly 
preferred. 
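
The "greatest local average hydrophilicity" mentioned above can be illustrated with a sliding-window average over the listed values. This is a rough sketch, not the procedure of the cited patent: the window length of 6 is an assumed default, the central value is used for entries given with a ±1 range, and the function name and toy sequence are hypothetical.

```python
# Hydrophilicity values as listed in the text; for entries given as a
# range (e.g. 3.0 +/- 1) the central value is used (assumption).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

def most_hydrophilic_window(sequence, window=6):
    """Return (start_index, average) for the window with the greatest
    average hydrophilicity. start_index is 0-based; ties keep the first
    window found."""
    if window <= 0 or len(sequence) < window:
        raise ValueError("window must be positive and fit in the sequence")
    best_start, best_average = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        average = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if average > best_average:
            best_start, best_average = start, average
    return best_start, best_average

# Toy sequence: a charged stretch flanked by tryptophans.
peak = most_hydrophilic_window("WWWWKDEKWWWW", window=4)
```

The same table can also be used, like the hydropathic index, to screen candidate substitutions whose values lie within a chosen tolerance of the original residue.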

While discussion has focused on functionally equivalent polypeptides 
arising from amino acid changes, it will be appreciated that these changes can be 
effected by alteration of the encoding DNA, taking into consideration also that the 
genetic code is degenerate and that two or more codons can code for the same 
amino acid. 

Thus, it will also be understood that this invention is not limited to the 
particular amino acid and nucleic acid sequences of any of SEQ ID NOs:1-16. 
Recombinant vectors and isolated DNA segments can therefore variously include 
an engineered NR, SR or GR or NR, SR or GR LBD mutant polypeptide-encoding 
region itself, include coding regions bearing selected alterations or modifications 
in the basic coding region, or include larger polypeptides which nevertheless 
comprise an NR, SR or GR or NR, SR or GR LBD mutant polypeptide-encoding 
region, or can encode biologically functional equivalent proteins or polypeptides 
which have variant amino acid sequences. Biological activity of an engineered 
NR, SR or GR or NR, SR or GR LBD mutant polypeptide can be determined, for 
example, by transcription assays known to those of skill in the art. 

The nucleic acid segments of the present invention, regardless of the 
length of the coding sequence itself, can be combined with other DNA sequences, 
such as promoters, enhancers, polyadenylation signals, additional restriction 
enzyme sites, multiple cloning sites, other coding segments, and the like, such 
that their overall length can vary considerably. It is therefore contemplated that a 
nucleic acid fragment of almost any length can be employed, with the total length 
preferably being limited by the ease of preparation and use in the intended 
recombinant DNA protocol. For example, nucleic acid fragments can be prepared 
which include a short stretch complementary to a nucleic acid sequence set forth 
in any of odd numbered SEQ ID NOs:1-15, such as about 10 nucleotides, and 
which are up to 10,000 or 5,000 base pairs in length. DNA segments with total 
lengths of about 4,000, 3,000, 2,000, 1,000, 500, 200, 100, and about 50 base 
pairs in length are also useful. 

The DNA segments of the present invention encompass biologically 
functional equivalents of engineered NR, SR or GR, or NR, SR or GR LBD mutant 
polypeptides. Such sequences can arise as a consequence of codon redundancy 
and functional equivalency that are known to occur naturally within nucleic acid 
sequences and the proteins thus encoded. Alternatively, functionally equivalent 
proteins or polypeptides can be created via the application of recombinant DNA 
technology, in which changes in the protein structure can be engineered, based 
on considerations of the properties of the amino acids being exchanged. 
Changes can be introduced through the application of site-directed mutagenesis 
techniques, e.g., to introduce improvements to the antigenicity of the protein or to 
test variants of an engineered mutant of the present invention in order to examine 
the degree of binding activity, or other activity at the molecular level. Various 
site-directed mutagenesis techniques are known to those of skill in the art and can be 
employed in the present invention. 

The invention further encompasses fusion proteins and peptides wherein 
an engineered mutant coding region of the present invention is aligned within the 
same expression unit with other proteins or peptides having desired functions, 
such as for purification or immunodetection purposes. 

Recombinant vectors form important further aspects of the present 
invention. Particularly useful vectors are those in which the coding portion of the 
DNA segment is positioned under the control of a promoter. The promoter can be 
that naturally associated with an NR, SR or GR gene, as can be obtained by 
isolating the 5' non-coding sequences located upstream of the coding segment or 
exon, for example, using recombinant cloning and/or PCR technology and/or other 
methods known in the art, in conjunction with the compositions disclosed herein. 

In other embodiments, certain advantages will be gained by positioning the 
coding DNA segment under the control of a recombinant, or heterologous, 
promoter. As used herein, a recombinant or heterologous promoter is a promoter 
that is not normally associated with an NR, SR or GR gene in its natural 
environment. Such promoters can include promoters isolated from bacterial, viral, 
eukaryotic, or mammalian cells. Naturally, it will be important to employ a 
promoter that effectively directs the expression of the DNA segment in the cell 
type chosen for expression. The use of promoter and cell type combinations for 
protein expression is generally known to those of skill in the art of molecular 
biology (See, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory 
Manual, Cold Spring Harbor Laboratory, New York, specifically incorporated 
herein by reference). The promoters employed can be constitutive or inducible 
and can be used under the appropriate conditions to direct high level expression 
of the introduced DNA segment, such as is advantageous in the large-scale 
production of recombinant proteins or peptides. One preferred promoter system 
contemplated for use in high-level expression is a T7 promoter-based system. 

Antibodies to an Engineered NR, SR or GR or NR, SR, GR LBD 
Mutant Polypeptide of the Present Invention 
The present invention also provides an antibody that specifically binds an 
engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide and methods 
to generate same. The term "antibody" indicates an immunoglobulin protein or 
functional portion thereof, including a polyclonal antibody, a monoclonal antibody, 
a chimeric antibody, a single chain antibody, Fab fragments, and a Fab 
expression library. "Functional portion" refers to the part of the protein that binds 
a molecule of interest. In a preferred embodiment, an antibody of the invention is 
a monoclonal antibody. Techniques for preparing and characterizing antibodies 
are well known in the art (See, e.g., Harlow & Lane (1988) Antibodies: A 
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
New York). A monoclonal antibody of the present invention can be readily 
prepared through use of well-known techniques such as the hybridoma 
techniques exemplified in U.S. Patent No. 4,196,265 and the phage-displayed 
techniques disclosed in U.S. Patent No. 5,260,203. 

The phrase "specifically (or selectively) binds to an antibody" or 
"specifically (or selectively) immunoreactive with", when referring to a protein or 
peptide, refers to a binding reaction which is determinative of the presence of the 
protein in a heterogeneous population of proteins and other biological materials. 
Thus, under designated immunoassay conditions, the specified antibodies bind to 
a particular protein and do not show significant binding to other proteins present in 
the sample. Specific binding to an antibody under such conditions can require an 
antibody that is selected for its specificity for a particular protein. For example, 
antibodies raised to a protein with an amino acid sequence encoded by any of the 
nucleic acid sequences of the invention can be selected to obtain antibodies 
specifically immunoreactive with that protein and not with unrelated proteins. 

The use of a molecular cloning approach to generate antibodies, 
particularly monoclonal antibodies, and more particularly single chain monoclonal 
antibodies, is also provided. The production of single chain antibodies has been 
described in the art. See, e.g., U.S. Patent No. 5,260,203. For this approach, 
combinatorial immunoglobulin phagemid libraries are prepared from RNA isolated 
from the spleen of the immunized animal, and phagemids expressing appropriate 
antibodies are selected by panning on endothelial tissue. The advantages of this 
approach over conventional hybridoma techniques are that approximately 10^4 
times as many antibodies can be produced and screened in a single round, and 
that new specificities are generated by heavy (H) and light (L) chain combinations 
in a single chain, which further increases the chance of finding appropriate 
antibodies. Thus, an antibody of the present invention, or a "derivative" of an 
antibody of the present invention, pertains to a single polypeptide chain binding 
molecule which has binding specificity and affinity substantially similar to the 
binding specificity and affinity of the light and heavy chain aggregate variable 
region of an antibody described herein. 

The term "immunochemical reaction", as used herein, refers to any of a 
variety of immunoassay formats used to detect antibodies specifically bound to a 
particular protein, including but not limited to competitive and non-competitive 
assay systems using techniques such as radioimmunoassays, ELISA (enzyme 
linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric 
assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ 
immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels), western 
blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, 
hemagglutination assays), complement fixation assays, immunofluorescence 
assays, protein A assays, and immunoelectrophoresis assays, etc. See Harlow & 
Lane (1988) for a description of immunoassay formats and conditions. 

Method for Detecting an Engineered NR, SR or GR or NR, SR, GR 
LBD Mutant Polypeptide or a Nucleic Acid Molecule Encoding the 
Same 

In another aspect of the invention, a method is provided for detecting a 
level of an engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide 
using an antibody that specifically recognizes an engineered NR, SR or GR or 
NR, SR, GR LBD mutant polypeptide, or portion thereof. In a preferred 
embodiment, biological samples from an experimental subject and a control 
subject are obtained, and an engineered NR, SR or GR or NR, SR, GR LBD 
mutant polypeptide is detected in each sample by immunochemical reaction with 
the antibody. More preferably, the antibody recognizes amino acids of any one of 
the even-numbered SEQ ID NOs:4, 6, 8, 12, 14, and 16, and is prepared 
according to a method of the present invention for producing such an antibody. 

In one embodiment, an antibody is used to screen a biological sample for 
the presence of an engineered NR, SR or GR or NR, SR, GR LBD mutant 
polypeptide. A biological sample to be screened can be a biological fluid such as 
extracellular or intracellular fluid, or a cell or tissue extract or homogenate. A 
biological sample can also be an isolated cell (e.g., in culture) or a collection of 
cells such as in a tissue sample or histology sample. A tissue sample can be 
suspended in a liquid medium or fixed onto a solid support such as a microscope 
slide. In accordance with a screening assay method, a biological sample is 
exposed to an antibody immunoreactive with an engineered NR, SR or GR or NR, 
SR, GR LBD mutant polypeptide whose presence is being assayed, and the 
formation of antibody-polypeptide complexes is detected. Techniques for 
detecting such antibody-antigen conjugates or complexes are well known in the 
art and include but are not limited to centrifugation, affinity chromatography and 
the like, and binding of a labeled secondary antibody to the antibody-candidate 
receptor complex. 

In another aspect of the invention, a method is provided for detecting a 
nucleic acid molecule that encodes an engineered NR, SR or GR or NR, SR, GR 
LBD mutant polypeptide. According to the method, a biological sample having 
nucleic acid material is procured and hybridized under stringent hybridization 
conditions to an engineered NR, SR or GR or NR, SR, GR LBD mutant 
polypeptide-encoding nucleic acid molecule of the present invention. Such 
hybridization enables a nucleic acid molecule of the biological sample and an 
engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide-encoding 
nucleic acid molecule to form a detectable duplex structure. Preferably, the 
engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide-encoding 
nucleic acid molecule includes some or all nucleotides of any one of the 
odd-numbered SEQ ID NOs:3, 5, 7, 11, 13, and 15. Also preferably, the biological 
sample comprises human nucleic acid material. 

XI. The Role of the Three-Dimensional Structure of the GRα LBD in Solving 
Additional NR, SR or GR Crystals 

Because polypeptides can crystallize in more than one crystal form, the 
structural coordinates of a GRα LBD, or portions thereof, as provided by the 
present invention, are particularly useful in solving the structure of other crystal 
forms of GRα and the crystalline forms of other NRs, SRs and GRs. The 
coordinates provided in the present invention can also be used to solve the 
structure of NR, SR or GR and NR, SR or GR LBD mutants (such as those 
described in Sections IX and X above), NR, SR or GR LBD co-complexes, or of 
the crystalline form of any other protein with significant amino acid sequence 
homology to any functional domain of NR, SR or GR. 

XI.A. Determining the Three-Dimensional Structure of a Polypeptide Using 
the Three-Dimensional Structure of the GRα LBD as a Template in 
Molecular Replacement 
One method that can be employed for the purpose of solving additional GR 
crystal structures is molecular replacement. See generally, Rossmann, ed. (1972) 
The Molecular Replacement Method, Gordon & Breach, New York. In the 
molecular replacement method, the unknown crystal structure, whether it is 
another crystal form of a GRα or a GRα LBD (i.e., a GRα or a GRα LBD mutant), 
or an NR, SR or GR or an NR, SR or GR LBD polypeptide complexed with 
another compound (a "co-complex"), or the crystal of some other protein with 
significant amino acid sequence homology to any functional region of the GRα 
LBD, can be determined using the GRα LBD structure coordinates provided in 
Table 4. This method provides an accurate structural form for the unknown 
crystal more quickly and efficiently than attempting to determine such information 
ab initio. 

In addition, in accordance with this invention, NR, SR or GR and NR, SR or 
GR LBD mutants can be crystallized in complex with known modulators. The 
crystal structures of a series of such complexes can then be solved by molecular 
replacement and compared with that of the wild-type NR, SR or GR or the 
wild-type NR, SR or GR LBD. Potential sites for modification within the various 
binding sites of the enzyme can thus be identified. This information provides an 
additional tool for determining the most efficient binding interactions, for example, 
increased hydrophobic interactions, between the GRα LBD and a chemical entity 
or compound. 

All of the complexes referred to in the present disclosure can be studied 
using X-ray diffraction techniques (See, e.g., Blundell & Johnson (1985) 
Methods Enzymol., 114A & 115B, (Wyckoff et al., eds.), Academic Press; McRee 
(1993) Practical Protein Crystallography, Academic Press, New York) and can be 
refined using computer software, such as the X-PLOR™ program (Brünger 
(1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and NMR, Yale 
University Press, New Haven, Connecticut; X-PLOR is available from Molecular 
Simulations, Inc., San Diego, California) and the XTAL-VIEW program (McRee 
(1992) J. Mol. Graphics 10: 44-46; McRee (1993) Practical Protein 
Crystallography, Academic Press, San Diego, California). This information can 
thus be used to optimize known classes of GR and GR LBD modulators, and 
more importantly, to design and synthesize novel classes of GR and GR LBD 
modulators. 

Laboratory Examples 

The following Laboratory Examples have been included to illustrate 
preferred modes of the invention. Certain aspects of the following Laboratory 
Examples are described in terms of techniques and procedures found or 
contemplated by the present inventors to work well in the practice of the invention. 
These Laboratory Examples are exemplified through the use of standard 
laboratory practices of the inventors. In light of the present disclosure and the 
general level of skill in the art, those of skill will appreciate that the following 
Laboratory Examples are intended to be exemplary only and that numerous 
changes, modifications and alterations can be employed without departing from 
the spirit and scope of the invention. 

Example 1 

Construction of the Modified pET24 Expression Vector 

The expression vector pGEX-2T (Amersham Pharmacia Biotech, 
Piscataway, New Jersey) was used as a template in a polymerase chain reaction 
to engineer a polyhistidine tag in frame to the sequence encoding glutathione S- 
transferase (GST) and a thrombin protease site. The forward primer contained a 
Nde I site (5' CGG CGG CGC CAT ATG AAA AAA GGT (CAT)6 GGT TCC CCT 
ATA CTA GGT TAT TGG A 3') (SEQ ID NO:19) and the reverse primer (5' CGG 
CGG CGC GGA TCC ACG CGG AAC CAG ATC CGA 3') (SEQ ID NO:20) 
contained a BamH I site, which allowed for direct cloning of the amplified product 
into pET24a (Novagen, Inc., Madison, Wisconsin) following restriction enzyme 
digestion. The resulting sequence of the modified GST (SEQ ID NO:21) (the last 
six residues are the thrombin protease site) is below: 

MKKGHHHHHH HGSPILGYWK IKGLVQPTRL LLEYLEEKYE EHLYERDEGD 50 
KWRNKKFELG LEFPNLPYYI DGDVKLTQSM AIIRYIADKH NMLGGCPKER 100 
AEISMLEGAV LDIRYGVSRI AYSKDFETLK VDFLSKLPEM LKMFEDRLCH 150 
KTYLNGDHVT HPDFMLYDAL DVVLYMDPMC LDAFPKLVCF KKRIEAIPQI 200 
DKYLKSSKYI AWPLQGWQAT FGGGDHPPKS DLVPRGS 237 

Example 2 

Mutagenesis (F602S AND F602D) of Human GR Ligand Binding Domain (LBD) 
Two complementary oligonucleotides for each desired mutation were 
constructed. The following sequences represent the oligonucleotides for the 
Phenylalanine 602 Serine mutation: 

Forward Primer (F602S) (SEQ ID NO:22): 

5' TAC TCC TGG ATG TCC CTT ATG GCA TTT GCT CT 3' 

Reverse Primer (F602S) (SEQ ID NO:23): 

5' AG AGC AAA TGC CAT AAG GGA CAT CCA GGA GTA 3' 

Another separate mutation was also constructed. The sequences below 
represent the oligonucleotides for the Phenylalanine 602 Aspartic Acid mutation: 

Forward Primer (F602D) (SEQ ID NO:24): 

5' TAC TCC TGG ATG GAC CTT ATG GCA TTT GCT CT 3' 

Reverse Primer (F602D) (SEQ ID NO:25): 

5' AG AGC AAA TGC CAT AAG GTC CAT CCA GGA GTA 3' 

The underlined letters depict the base changes from the wild type human 
GR sequence. The GR LBD (amino acids 521-777) (SEQ ID NOs:9-10) 
previously cloned into the pRSET A vector (Invitrogen of Carlsbad, California) was 
used as the backbone to create the mutants. The procedure used to make the 
mutation is outlined in the QuikChange Site-Directed Mutagenesis Kit sold by 
Stratagene, La Jolla, California (Catalog # 200518). After the constructs were 
sequence verified, the mutants of GR-LBD were subcloned in frame with the 
glutathione S-transferase in the modified pET24 expression vector. A thrombin 
protease site at the C-terminus of the glutathione S-transferase allows for 
cleavage of the resultant fusion protein following expression. 
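The primer pairs listed above can be sanity-checked computationally. The following is a minimal sketch, not part of the disclosed procedure: it confirms that each reverse primer is the reverse complement of its forward primer and translates the forward primers in the reading frame in which they are printed (an assumption taken from the triplet grouping). The helper names are illustrative only.

```python
# Sanity check for the mutagenic primer pairs (SEQ ID NOs:22-25).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def clean(seq):
    """Remove the spaces used for triplet grouping and upper-case the sequence."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(seq)))


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    dna = clean(seq)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


F602S_FWD = "TAC TCC TGG ATG TCC CTT ATG GCA TTT GCT CT"
F602S_REV = "AG AGC AAA TGC CAT AAG GGA CAT CCA GGA GTA"
F602D_FWD = "TAC TCC TGG ATG GAC CTT ATG GCA TTT GCT CT"
F602D_REV = "AG AGC AAA TGC CAT AAG GTC CAT CCA GGA GTA"

if __name__ == "__main__":
    for name, fwd, rev in (("F602S", F602S_FWD, F602S_REV),
                           ("F602D", F602D_FWD, F602D_REV)):
        print(name, "complementary:", reverse_complement(fwd) == clean(rev),
              "encoded peptide:", translate(fwd))
```

In this frame the fifth codon of each forward primer (TCC or GAC) encodes the substituted residue, serine or aspartic acid respectively.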

The resulting final amino acid sequences for the mutant GR LBDs are 
below. The underlined, bolded amino acids depict the changes from the wild type 
human GR sequence. 



GR-LBD(521-777) F602S (SEQ ID NO:12) 

VPATLPQLTP TLVSLLEVIE PEVLYAGYDS SVPDSTWRIM TTLNMLGGRQ 
VIAAVKWAKA IPGFRNLHLD DQMTLLQYSW MSLMAFALGW RSYRQSSANL 
LCFAPDLIIN EQRMTLPCMY DQCKHMLYVS SELHRLQVSY EEYLCMKTLL 
LLSSVPKDGL KSQELFDEIR MTYIKELGKA IVKREGNSSQ NWQRFYQLTK 
LLDSMHEVVE NLLNYCFQTF LDKTMSIEFP EMLAEIITNQ IPKYSNGNIK 
KLLFHQK 

GR-LBD(521-777) F602D (SEQ ID NO:14) 

VPATLPQLTP TLVSLLEVIE PEVLYAGYDS SVPDSTWRIM TTLNMLGGRQ 
VIAAVKWAKA IPGFRNLHLD DQMTLLQYSW MDLMAFALGW RSYRQSSANL 
LCFAPDLIIN EQRMTLPCMY DQCKHMLYVS SELHRLQVSY EEYLCMKTLL 
LLSSVPKDGL KSQELFDEIR MTYIKELGKA IVKREGNSSQ NWQRFYQLTK 
LLDSMHEVVE NLLNYCFQTF LDKTMSIEFP EMLAEIITNQ IPKYSNGNIK 
KLLFHQK 
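Because the listings above are printed without residue numbers, it can help to map positions back to GR numbering. The sketch below assumes, based on the "GR-LBD(521-777)" labels, that the first listed residue is residue 521; it then looks up residue 602 in each mutant and lists the positions at which the two listings differ. The function and constant names are illustrative only.

```python
# Map the mutant LBD listings to GR residue numbers (assumes first residue = 521).
FIRST_RESIDUE = 521

ROW1 = "VPATLPQLTPTLVSLLEVIEPEVLYAGYDSSVPDSTWRIMTTLNMLGGRQ"
ROW2_F602S = "VIAAVKWAKAIPGFRNLHLDDQMTLLQYSWMSLMAFALGWRSYRQSSANL"
ROW2_F602D = "VIAAVKWAKAIPGFRNLHLDDQMTLLQYSWMDLMAFALGWRSYRQSSANL"
REST = (
    "LCFAPDLIINEQRMTLPCMYDQCKHMLYVSSELHRLQVSYEEYLCMKTLL"
    "LLSSVPKDGLKSQELFDEIRMTYIKELGKAIVKREGNSSQNWQRFYQLTK"
    "LLDSMHEVVENLLNYCFQTFLDKTMSIEFPEMLAEIITNQIPKYSNGNIK"
    "KLLFHQK"
)

GR_LBD_F602S = ROW1 + ROW2_F602S + REST
GR_LBD_F602D = ROW1 + ROW2_F602D + REST


def residue_at(sequence, number, first=FIRST_RESIDUE):
    """Return the one-letter residue at a given GR residue number."""
    return sequence[number - first]


def differences(seq_a, seq_b, first=FIRST_RESIDUE):
    """List (residue number, residue in a, residue in b) wherever two listings differ."""
    return [(first + i, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    print(len(GR_LBD_F602S), "residues")
    print("602 in F602S:", residue_at(GR_LBD_F602S, 602))
    print("602 in F602D:", residue_at(GR_LBD_F602D, 602))
    print("differences:", differences(GR_LBD_F602S, GR_LBD_F602D))
```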

Example 3 

Expression of the Fusion Protein 
BL21(DE3) cells (Novagen, Inc., Madison, Wisconsin) were transformed 
following established protocols. Following overnight incubation at 37°C, a single 
colony was used to inoculate a 10 ml LB culture containing 50 µg/ml kanamycin 
(Sigma Chemical Company, St. Louis, Missouri). The culture was grown for ~12 
hrs at 37°C and then a 500 µl aliquot was used to inoculate flasks containing 1 liter 
Circle Grow media (Bio101, Inc., now Qbiogene of Carlsbad, California) and the 
required antibiotic. The cells were then grown at 22°C to an OD600 between 1 
and 2 and then cooled to 16°C. Following a 30 min equilibration at that 
temperature, dexamethasone (Spectrum, Gardena, California) (10 µM final 
concentration) was added. Induction of expression was achieved by adding IPTG 
(BACHEM AG, Switzerland) (final concentration 1 mM) to the cultures. 
Expression at 16°C was continued for ~24 hrs. Cells were then harvested and 
frozen at -80°C. 

Referring now to Figure 1A, E. coli expression of mutant 6xHisGST- 
GR(521-777) F602S is shown. Shown are the pellet (P - insoluble) and eluent (E 
- soluble Ni++ binding) fractions of protein expressed in the absence of ligand (NL 
- lanes 2 and 3) or in the presence (10 micromolar) of dexamethasone (DEX), 
lanes 4 and 5, or RU486, lanes 6 and 7. The positions of molecular mass (kDa) 
markers M (lane 1) (94, 67, 43, 30, 20 and 14 kDa, respectively) and of the 
expressed protein are indicated to the left and right sides of the panel, 
respectively. 

Referring now to Figure 1B, E. coli expression of mutant 6xHisGST- 
GR(521-777) F602D is shown. Shown are eluent fractions from Ni++ chelated 
resin of two separate samples. Protein was expressed in either the presence (+, 
lanes 2 and 4, 10 micromolar) or absence (-, lanes 3 and 5) of dexamethasone. 
The positions of molecular mass (kDa) markers M (lane 1) (94, 67, 43, 30, 20 and 
14 kDa, respectively) and of the expressed protein are indicated to the left and 
right sides of the panel, respectively. 

Example 4 
Purification of GR-LBD (F602S) 
~200 g cells were resuspended in 700 mL lysis buffer (50 mM Tris pH 8.0, 
150 mM NaCl, 2 M urea, 10% glycerol and 100 µM dexamethasone) and lysed by 
passing 3 times through an APV Lab 2000 homogenizer. The lysate was 
subjected to centrifugation (45 minutes, 20,000 g, 4°C), followed by a second 20 
min spin at 20,000 g, 4°C. The cleared supernatant was filtered through coarse 
pre-filters, and 50 mM Tris, pH 8.0, containing 150 mM NaCl, 10% glycerol and 1 M 
imidazole was added to obtain a final imidazole concentration of 50 mM. This 
lysate was loaded onto an XK-26 column (Pharmacia, Peapack, New Jersey) 
packed with SEPHAROSE® [Ni++ charged] Chelation resin (Pharmacia, Peapack, 
New Jersey) and pre-equilibrated with lysis buffer supplemented with 50 mM 
imidazole. Following loading, the column was washed to baseline absorbance 
with equilibration buffer and a linear urea gradient (2 M to 0). For elution, the 
column was developed with a linear gradient from 50 to 500 mM imidazole in 
50 mM Tris pH 8.0, 150 mM NaCl, 10% glycerol and 30 µM dexamethasone. 
Column fractions of interest were pooled and 500 units of thrombin protease 
(Amersham Pharmacia Biotech, Piscataway, New Jersey) were added for the 
cleavage of the fusion protein. 
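The imidazole adjustment described in this example (adding a 1 M imidazole buffer to reach 50 mM final) is a simple mass-balance calculation. The sketch below is illustrative only: the function name is hypothetical, and the 700 mL sample volume is an assumption borrowed from the lysis buffer volume, since the actual volume of cleared supernatant is not stated.

```python
def stock_volume_to_add(sample_ml, stock_mM, target_mM, initial_mM=0.0):
    """Volume of stock (mL) to add so the mixture reaches target_mM.

    Solves (initial*V + stock*x) / (V + x) = target for x.
    """
    if not initial_mM <= target_mM < stock_mM:
        raise ValueError("target must lie between the initial and stock concentrations")
    return sample_ml * (target_mM - initial_mM) / (stock_mM - target_mM)


if __name__ == "__main__":
    sample_ml = 700.0  # assumed supernatant volume (not stated in the text)
    added = stock_volume_to_add(sample_ml, stock_mM=1000.0, target_mM=50.0)
    final_mM = 1000.0 * added / (sample_ml + added)
    print(f"add {added:.1f} mL of 1 M imidazole buffer; final = {final_mM:.1f} mM")
```

Under these assumptions roughly 37 mL of the imidazole-containing buffer would be needed. The same addition dilutes the other lysate components, such as urea, by about 5%.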

This solution was then dialyzed against 1 liter of 50 mM Tris pH 8.0, 150 
mM NaCl, 10% glycerol and 20 µM dexamethasone for ~10 hrs at 4°C. The 
digested protein sample was filtered and then reloaded onto the same 
re-equilibrated column. The cleaved GR-LBD was collected in the flow through 
fraction. The diluted protein sample was concentrated with Centri-prep™ 10K 
centrifugal filtration devices (Amicon/Millipore, Bedford, Massachusetts) to a 
volume of 30 ml and then diluted 5-fold with 50 mM Tris pH 8.0, 10% glycerol, 
10 mM DTT, 0.5 mM EDTA and 30 µM dexamethasone. The sample was then 
loaded onto a pre-equilibrated XK-26 column (Pharmacia, Peapack, New Jersey) 
packed with Poros HQ resin (PerSeptive Biosystems, Framingham, 
Massachusetts). The cleaved GR LBD was collected in the flowthrough. The 
NaCl concentration was adjusted to 500 mM and the dexamethasone 
concentration was adjusted to 50 µM before the purified protein was concentrated 
to ~1 mg/ml using the Centri-prep™ 10K centrifugal filtration devices. 

Figure 1A depicts purification of E. coli expressed GR(521-777) F602S by 
SDS-PAGE. Lane 1 contains the insoluble pellet fraction. Lane 2 contains the 
soluble supernatant fraction. Lane 3 contains pooled eluent from the initial Ni++ 
column. Lane 4 contains the sample after thrombin digestion. Lane 5 contains 
the flow through fraction after reload of the Ni++ column. Lane 6 contains the 
concentrated protein after anion exchange. The positions of molecular mass (kDa) 
markers (in Lane M, 94, 67, 43, 30, 20 and 14 kDa, respectively) and of the 
expressed protein are indicated to the left and right sides of the panel, 
respectively. Purification provides for the removal of any remaining associated 
bacterial HSPs. 

The final resultant sequence (SEQ ID NO:32) of the purified protein is 
below. The first two residues (underlined and bolded) are vector derived and 
represent the remaining residues of the thrombin cleavage site following digestion. 

GSVPATLPQL TPTLVSLLEV IEPEVLYAGY DSSVPDSTWR IMTTLNMLGG 
RQVIAAVKWA KAIPGFRNLH LDDQMTLLQY SWMSLMAFAL GWRSYRQSSA 
NLLCFAPDLI INEQRMTLPC MYDQCKHMLY VSSELHRLQV SYEEYLCMKT 
LLLLSSVPKD GLKSQELFDE IRMTYIKELG KAIVKREGNS SQNWQRFYQL 
TKLLDSMHEV VENLLNYCFQ TFLDKTMSIE FPEMLAEIIT NQIPKYSNGN 
IKKLLFHQK 

Example 5 
Ligand and Coactivator Binding of GR 
All experiments were conducted with buffer containing 10 mM HEPES pH 
7.4, 0.15 M NaCl, 3 mM EDTA, 0.005% polysorbate-20 and 5 mM DTT. For 
activity determinations, 10 nM of fluorescein dexamethasone (Molecular Probes, 
Eugene, Oregon) was titrated with increasing concentrations of the glucocorticoid 
receptor in black 96-well plates (CoStar, Cambridge, Massachusetts). The 
fluorescence polarization values for each concentration of receptor were 
determined using a BMG PolarStar Galaxy fluorescence plate reader (BMG 
Labtechnologies GmbH, Offenburg, Germany) with 485 nm excitation and 520 nm 
emission filters. Binding isotherms were constructed and apparent EC50 values 
were determined by non-linear least squares fit of the data to an equation for a 
simple 1:1 interaction. Note that these EC50 values are not corrected for the 
unlabeled dexamethasone present in the GR receptor preparations. For stability 
studies, the fluorescence polarization of 10 nM fluorescein dexamethasone with 1 
µM GST-GR LBD 521-777 (F602S) was read at specific time intervals in the 
presence or absence of 25 µM of a peptide derived from the coactivator TIF2 
(TIF2 732-756: QEPVSPKKKENALLRYLLDKDDTKD) (SEQ ID NO:17). 
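The fit of polarization data to a simple 1:1 interaction mentioned above can be illustrated with a small sketch. This is not the fitting software used by the inventors, which is not identified; it is one possible least-squares approach in which the EC50 is scanned on a logarithmic grid and the two plateau values are solved linearly at each grid point. The data in the usage section are synthetic and all names are hypothetical.

```python
def binding_signal(receptor_nM, p_free, p_bound, ec50_nM):
    """Polarization predicted by a simple 1:1 binding isotherm."""
    fraction_bound = receptor_nM / (ec50_nM + receptor_nM)
    return p_free + (p_bound - p_free) * fraction_bound


def _linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_ec50(conc_nM, polarization, lo=0.1, hi=1e5, steps=400):
    """Scan EC50 on a log grid; at each value fit p_free and p_bound linearly.

    Returns (ec50, p_free, p_bound) for the grid point with the smallest error.
    """
    best = None
    for i in range(steps + 1):
        ec50 = lo * (hi / lo) ** (i / steps)
        fractions = [c / (ec50 + c) for c in conc_nM]
        a, b, sse = _linear_fit(fractions, polarization)
        if best is None or sse < best[0]:
            best = (sse, ec50, a, a + b)
    _, ec50, p_free, p_bound = best
    return ec50, p_free, p_bound


if __name__ == "__main__":
    # Synthetic, noise-free titration generated from known parameters.
    conc = [1, 3, 10, 30, 100, 300, 1000, 3000]
    data = [binding_signal(c, 40.0, 250.0, 50.0) for c in conc]
    print(fit_ec50(conc, data))
```

The grid resolution limits the precision of the returned EC50 (about 3.5% per step with the default settings); a finer grid or a local refinement step could be added if needed.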

Data from these experiments are presented graphically in Figures 2A-2C. 
These studies demonstrate that the GST-GR fusion protein and the cleaved GR 
LBD alone bind dexamethasone in a saturable and competable manner (Figure 
2A). It was also found that the GST-GR fusion protein binds a peptide from the 
coactivator TIF2 with a submicromolar affinity. Binding of the GST-GR fusion 
protein is enhanced by the agonist dexamethasone (DEX) and inhibited by the 
antagonist RU486 (Fig. 2B). Finally, it was also found that the addition of the TIF2 
peptide stabilizes the dexamethasone binding activity of the GST-GR fusion 
protein. 

Figure 2B was generated using Biacore techniques. Biacore relies on 
changes in the refractive index at the surface layer upon binding of a ligand to a 
protein immobilized on the layer. In this system, a collection of small ligands is 
injected sequentially in a 2-5 microliter cell, wherein the protein is immobilized 
within the cell. Binding is detected by surface plasmon resonance (SPR) by 
recording laser light refracting from the surface. In general, the refractive index 
change for a given change of mass concentration at the surface layer is practically 
the same for all proteins and peptides, allowing a single method to be applicable 
for any protein (Liedberg et al. (1983) Sensors Actuators 4:299-304; Malmqvist 
(1993) Nature 361:186-187). The purified protein is then used in the assay 
without further preparation. A synthetic peptide with an amino-terminal biotin is 
coupled to a sensor chip immobilized with streptavidin. The chip thus prepared is 
then exposed to the potential ligand via the delivery system incorporated in the 
instruments sold by Biacore (Uppsala, Sweden) to pipet the ligands in a 
sequential manner (autosampler). The SPR signal on the chip is recorded and 
changes in the refractive index indicate an interaction between the immobilized 
target and the ligand. Analysis of the signal kinetics of on rate and off rate allows 
the discrimination between non-specific and specific interaction. 
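The on-rate and off-rate analysis mentioned above is commonly described with a 1:1 Langmuir interaction model. The sketch below illustrates that textbook model only; it is not taken from the Biacore software, and the rate constants used in the usage section are hypothetical values chosen for illustration.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant KD (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def association_response(t_s, conc_M, k_on, k_off, r_max):
    """SPR response during analyte injection for a 1:1 interaction."""
    k_obs = k_on * conc_M + k_off            # observed rate constant
    r_eq = r_max * k_on * conc_M / k_obs     # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


def dissociation_response(t_s, r_start, k_off):
    """SPR response after the injection ends, decaying from r_start."""
    return r_start * math.exp(-k_off * t_s)


if __name__ == "__main__":
    k_on, k_off, r_max = 1e5, 1e-2, 100.0    # hypothetical values
    kd = dissociation_constant(k_on, k_off)
    plateau = association_response(1e4, kd, k_on, k_off, r_max)
    print(f"KD = {kd:.1e} M, plateau at [analyte] = KD: {plateau:.1f} RU")
    print(f"after one half-life: "
          f"{dissociation_response(math.log(2) / k_off, plateau, k_off):.1f} RU")
```

In this model a specific interaction shows a concentration-dependent association phase and a single-exponential dissociation phase, which is one way the two kinds of interaction might be distinguished.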

Example 6 

Preparation of the GR/TIF2/Dex Complex 

The GR/TIF2/Dex complex was prepared by adding a 2-fold excess of a 
TIF2 peptide containing the sequence QEPVSPKKKENALLRYLLDKDDTKD (SEQ 
ID NO:17). The above complex was diluted 10-fold with a buffer containing 500 
mM ammonium acetate (NH4OAc), 50 mM Tris, pH 8.0, 10% glycerol, 10 mM 
dithiothreitol (DTT), 0.5 mM EDTA, and 0.05% beta-N-octoglucoside (b-OG), and 
was slowly concentrated to 6.3 mg/ml, then aliquoted and stored at -80°C. 

Example 7 
Crystallization and Data Collection 
The GR/TIF2/DEX crystals were grown at room temperature in hanging 
drops containing 3.0 µl of the above protein-ligand solutions and 0.5 µl of well 
buffer (50 mM HEPES, pH 7.5-8.5 (preferred pH range is 8.0 to 8.5), and 1.7-2.3 M 
ammonium formate). Crystals were also obtained with mixing of the above protein 
solution and the well buffer at various volume ratios. Crystals appeared overnight 
and continuously grew to a size of up to 300 microns within a week. Before data 
collection, crystals were transiently mixed with the well buffer that contained an 
additional 25% glycerol, and were then flash frozen in liquid nitrogen. 

The GR/TIF2/DEX crystals formed in the P61 space group, with a = b = 
126.014 Å, c = 86.312 Å, α = β = 90°, and γ = 120°. Each asymmetric unit contains 
two molecules of the GR LBD with 56% of solvent content. Data were collected 
with a Rigaku Raxis IV detector in house. The observed reflections were reduced, 
merged and scaled with DENZO and SCALEPACK in the HKL2000 package (Z. 
Otwinowski and W. Minor (1997)). 
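As a worked illustration of the cell parameters reported above, the unit-cell volume can be computed from the general triclinic formula, which reduces to V = a·b·c·sin(γ) for this hexagonal cell. The sketch below is illustrative only; the function name is hypothetical, and the division by six assumes the six symmetry-equivalent positions of space group P61.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit-cell volume in cubic angstroms from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


if __name__ == "__main__":
    volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)
    asu_per_cell = 6  # assumed from the P61 space group
    print(f"cell volume: {volume:,.0f} Å^3")
    print(f"volume per asymmetric unit: {volume / asu_per_cell:,.0f} Å^3")
```

With these parameters the cell volume is about 1.19 × 10^6 Å^3, or roughly 2.0 × 10^5 Å^3 per asymmetric unit.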

Example 8 
Structure Determination and Refinement 

Table 5 is a table of the atomic structure coordinates used as the initial 
model to solve the structure of the GR/TIF2/dexamethasone complex by 
molecular replacement. The GR model is a homology model built on the 
published structure of the progesterone receptor LBD and the SRC1 coactivator 
peptide from the PPARα/Compound 1/SRC1 structure. 

Compound 1 is an agonist of hPPARα, and has the IUPAC name 2-methyl- 
2-[4-{[(4-methyl-2-[4-trifluoromethylphenyl] thiazol-5-yl-carbonyl) amino] methyl} 
phenoxy] propionic acid. 

[Chemical structure: Compound 1] 

The initial model for the molecular replacement calculation comprised 
coordinates for residues 527-776 of wild-type GR together with coordinates for 
residues 685-697 of SRC-1, a coactivator very similar to TIF2. The model for GR 
was built from the crystal structure of PR bound to progesterone (Shawn P. 
Williams and Paul B. Sigler, Nature 393, 392-396 (1998)) using the MVP program 
(Lambert, 1997). The coordinates for SRC-1 were obtained from a crystal 
structure of PPARα bound to SRC-1. The SRC-1 model was positioned in the 
coactivator binding site of GR by rotating the GR model and PPARα/SRC-1 
complex into a common orientation that superimposed their backbone atoms. 
It is noted that the amino acid sequence for SRC-1 differs substantially 
from that of TIF2, although both coactivator sequences have the LXXLL motif. 
Model building, including conversion of side-chains from the SRC-1 and wild-type 
GR sequences to the actual TIF2 and GR F602S sequences, respectively, was 
carried out with QUANTA™. 

This model was used in a molecular replacement search with the CCP4 
AmoRe™ program (Collaborative Computational Project Number 4, 1994, "The 
CCP4 Suite: Programs for Protein Crystallography", Acta Cryst D50, 760-763; 
J. Navaza, Acta Cryst A50, 157-163 (1994)) to determine the initial structure 
solutions. Two solutions were obtained from the molecular replacement search 
with a correlation coefficient of 43% and an R-factor of 45.3%, consistent with 
two complexes within each asymmetric unit. The calculated phase from the 
molecular replacement solutions was improved with solvent flattening, histogram 
matching and two-fold noncrystallographic averaging as implemented in the 
CCP4 dm program, and produced a clear map for the GR LBD, the TIF2 peptide 
and the dexamethasone. As noted above, model building proceeded with 
QUANTA™, and refinement progressed with CNX (Accelrys, Princeton, New 
Jersey) and multiple cycles of manual rebuilding. The statistics of the structure are 
summarized in Table 3 and coordinates are presented in Table 4. 
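The R-factor quoted above and the R free statistic defined in the footnotes to Table 3 follow the standard residual between observed and calculated structure-factor amplitudes. The sketch below illustrates that definition together with a random 8% test-set split; it is not the CNX or AmoRe implementation, the function names are hypothetical, and the amplitudes in the usage section are made-up numbers.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(|Fobs - Fcalc|) / sum(Fobs) over a set of reflections."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def split_free_set(n_reflections, free_fraction=0.08, seed=0):
    """Randomly pick indices for the free (test) set; the rest form the working set."""
    n_free = round(n_reflections * free_fraction)
    free = set(random.Random(seed).sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


if __name__ == "__main__":
    f_obs = [100.0, 200.0, 300.0]    # made-up amplitudes
    f_calc = [90.0, 220.0, 300.0]
    print(f"R = {r_factor(f_obs, f_calc):.3f}")
    work, free = split_free_set(1000)
    print(len(work), "working reflections,", len(free), "free reflections")
```

R factor is evaluated on the working set and R free on the held-out reflections, which are never used in refinement.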

Surface areas were calculated with the Connolly MS program (Michael L. 
Connolly, "Solvent-Accessible Surfaces of Proteins and Nucleic Acids," Science 
221, 709-713 (1983)) and the MVP program (Lambert, 1997). The pocket volume 
and binding site accessible waters were calculated with MVP. 

Example 9 

Random Mutant Library of GR LBD and Selection using the LacI Fusion System 
The expression vector pJS142A (Affymax Inc., Palo Alto, California) 
containing the LacI protein was used to clone the wild type GR LBD in frame with 
the LacI gene. Using standard error-incorporating PCR techniques, a random 
mutant library was created within the context of the GR LBD. An advantage of the 
LacI expression system is that the protein expressed has the ability to bind the 
plasmid DNA from which it was derived. The mutant fusion proteins produced by 
the random library were expressed in E. coli at 37°C. Lysis of the cell cultures 
was achieved using lysozyme. The cell lysates were then added to a microtiter 
plate containing the immobilized coactivator peptide biotinylated-TIF2 NR Box III. 
The plasmid DNA was eluted from the DNA-protein complex bound to the plate 
using 1 mM IPTG (Life Technologies). The eluted DNA was then re-transformed 
and individual clones were isolated for sequence analysis. Mutant fusion proteins 
with increased solubility and activity (ability to bind coactivator) should be selected 
for after rounds of panning and increased stringency washes. Once the sequence 
of the mutant LacI-GR LBD was identified, the same mutation was also made in 
the pET24 expression vector (see Example 1). The expression and partial 
purification of the mutant LacI-derived GST-GR LBD fusion proteins were 
performed in the same manner as described in Examples 3 and 4. 

Figure 1D depicts the partial purification of E. coli expressed GR (521-777) 
for several mutants isolated by the LacI Fusion system. For solubility testing, 
these mutants are expressed as a fusion to 6xHis-GST using the modified pET24 
expression vector. Continuing with Figure 1D, Lane 1 contains the soluble 
fraction of GST-GR (521-777) F602S, Lane 2: GR (521-777) wild type, Lane 3: 
GST-GR (521-777) A580T/F602L, Lane 4: GST-GR (521-777) A574T, Lane 5: 
GST-GR (521-777) Q615H, and Lane 6: GST-GR (521-777) Q615L. Molecular 
weight markers (kD) are shown in Lane M. 
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Table 3 

Statistics of Crystallographic Data and Structure 

Crystals                          GR/TIF2 with dexamethasone 
Space group                       P61 
Resolution (Å)                    20.0-2.8 
Unique reflections (N)            18,923 
Completeness (%)                  99.7 
I/σ (last shell)                  25.6 (2.2) 
Rsym (a) (%)                      8.5 

Refinement statistics 
R factor (b) (%)                  33.4 
R free (%)                        29.6 
r.m.s.d. bond lengths (Å)         0.015 
r.m.s.d. bond angles (degrees)    1.795 
Number of H2O                     53 
Total non-hydrogen atoms          4444 

r.m.s.d. is the root mean square deviation from ideal geometry. 

(a) $R_{sym} = \sum |I_{avg} - I_i| / \sum I_i$ 

(b) $R_{factor} = \sum |F_P - F_{Pcalc}| / \sum F_P$, where $F_P$ and $F_{Pcalc}$ are 
the observed and calculated structure factors. R free is calculated from a 
randomly chosen 8% of reflections that were never used in refinement, and 
R factor is calculated for the remaining 92% of reflections. 
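
The residual statistics defined in the footnotes to Table 3 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the refinement software used for the structure: the function names (`r_factor`, `r_sym`, `split_work_free`), the random seed, and the toy amplitudes are hypothetical, and structure factors are treated as plain lists of amplitudes.

```python
import random


def r_factor(f_obs, f_calc):
    """Return sum(|Fobs - Fcalc|) / sum(Fobs) over structure-factor amplitudes."""
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    denominator = sum(f_obs)
    return numerator / denominator


def r_sym(intensity_groups):
    """Return sum(|I_avg - I_i|) / sum(I_i).

    Each inner list holds the repeated measurements I_i of one
    symmetry-unique reflection; I_avg is the mean of that list.
    """
    numerator = 0.0
    denominator = 0.0
    for group in intensity_groups:
        mean_i = sum(group) / len(group)
        numerator += sum(abs(mean_i - i) for i in group)
        denominator += sum(group)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.08, seed=0):
    """Randomly split reflection indices into (working set, free set).

    The free set is excluded from refinement and used only for R free;
    the working set is used for refinement and for R factor.
    """
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_free = round(n_reflections * free_fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


# Toy amplitudes, invented for illustration (not data from Table 3).
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [9.0, 22.0, 27.0, 41.0]
print("R factor (toy data):", r_factor(f_obs, f_calc))

# 18,923 is the number of unique reflections reported in Table 3.
work, free = split_work_free(18923)
print("working reflections:", len(work), "free reflections:", len(free))
```

Keeping the free set out of refinement is what makes R free an independent check on the model, so the split is done once, before any refinement cycle.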
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TABLE 4 

ATOMIC STRUCTURE COORDINATE DATA OBTAINED FROM X-RAY DIFFRACTION FROM THE 
LIGAND BINDING DOMAIN OF GRα IN COMPLEX WITH DEXAMETHASONE 



ATOM  ATOM TYPE  RESIDUE  PROTEIN #  #  X  Y  Z  OCC  B 

1     CB   GLN  527   60.207    9.806   35.497  1.00  60.77 
2     CG   GLN  527   60.501   11.318   35.564  1.00  60.74 
3     CD   GLN  527   60.595   11.993   34.172  1.00  63.52 
4     OE1  GLN  527   60.493   13.224   34.058  1.00  61.80 
5     NE2  GLN  527   60.794   11.187   33.121  1.00  61.21 
6     C    GLN  527   62.073    8.590   36.647  1.00  62.83 
7     O    GLN  527   63.240    8.191   36.724  1.00  59.67 
8     N    GLN  527   61.009    7.618   34.618  1.00  58.91 
9     CA   GLN  527   61.426    8.890   35.289  1.00  62.13 
10    N    LEU  528   61.308    8.776   37.716  1.00  62.73 
11    CA   LEU  528   61.816    8.538   39.064  1.00  65.02 
12    CB   LEU  528   62.105    9.889   39.733  1.00  62.65 
13    CG   LEU  528   62.864   10.872   38.813  1.00  59.23 
14    CD1  LEU  528   62.071   12.198   38.675  1.00  63.52 
15    CD2  LEU  528   64.283   11.105   39.356  1.00  60.04 


16 


c 


LEU 




en 000 
DU.o2o 


7.690 


39.888 


1.00 


59.38 


17 


o 


LEU 

l— L_ W 




cn coc 


6.527 


39.527 


1.00 


63.35 


18 


N 


THR 

I 1 1 1 X 


^9Q 


DU.24/ 


0 oco 

8.256 


40.960 


1.00 


60.40 


19 


CA 


THR 




C^Q OP.O 

oy.zo2 


"7 con 
f .Oo9 


A A O O 

41.835 


1.00 


60.79 


20 


CB 


THR 


52Q 


^7 P/H 


O 00"7 


A A Q A ~7 

41 .847 


1.00 


63.67 


21 


OG1 


THR 




^7 Q<1 O 


y.ooi 


42.382 


1.00 


60.60 


22 


CG2 


THR 


529 


cc oc-7 
OD.OO / 


f .41 U 


42.706 


1.00 


62.04 


23 


C 


THR 


529 1 




D.UOD 




a r\r\ 

1.00 


61.38 


24 


O 


THR 


529 


58.454 


5.754 
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-5.048 



-5.952 



26.645 



25.652 



26.584 



26.608 



26.240 



26.788 



26.139 



26.719 



26.350 



25.560 



26.436 



27.420 



26.138 



28.255 



-4.970 



-3.910 



-6.003 



-5.468 



-4.015 



-7.235 



-7.143 



-8.386 



-9.572 



62.685 



63.697 



58.599 



58.705 



57.550 



1.00 



1.00 



1.00 



1.00 



57.166 



56.516 



55.434 



55.668 



57.172 



58.271 



56.523 



-10.834 



-10.608 



-10.142 



10.844 



-9.084 



28.850 



28.891 



30.336 



30.777 



32.255 



33.383 



33.649 



30.884 



31.963 



30.163 



30.627 



30.187 



31.214 



31.713 



30.567 



30.114 

30.779 

28.927 

28.337 

26.963 

29.287 

29.619 

29.724 

30.632 

30.674 

31.733 

31.348 

30.318 

32.075 



-9.389 



57.122 



56.311 



55.002 



53.837 



53.519 



53.240 



-9.561 



-9.004 



-8.808 



-8.060 



-7.666 



57.219 



1.00 



1.00 



1.00 



65.31 



59.87 



60.75 



62.03 



61.23 



60.04 



61.34 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



64.04 



60.17 



60.83 



59.53 



60.43 



61.33 



1.00 



1.00 



1.00 



1.00 



58.292 



56.115 



56.109 



54.857 



-9.076 



-9.386 



-8.072 



-8.394 



-7.067 



-6.323 



-4.862 



-3.835 



-4.171 



-2.471 



6.946 



-6.885 
-7.541 
8.194 
-8.775 
-9.313 
-9.482 
-10.044 
11.194 
11.851 
12.933 
14.560 
-14.334 
14.728 



54.898 



1.00 



62.90 



60.51 



57.86 



60.97 



59.16 



1.00 



1.00 



1.00 



1.00 



55.043 



53.307 



57.323 



57.811 



57.796 



58.957 



58.881 



58.397 



57.010 



58.411 



60.241 



61.273 

60.177 

61.343 

60.983 

61.747 

62.922 

60.722 

60.795 

59.407 

59.190 

59.811 

59.388 

60.719 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 



58.48 



59.47 



64.26 



61.15 



63.84 



64.44 



67.15 



59.96 



62.38 



60.81 



59.19 



62.39 



61.65 



61.49 



61.11 



63.29 



58.84 



61.39 

62.56 

61.99 

59.37 

66.01 

63.54 

63.92 

59.42 

60.24 

62.29 

59.60 

60.67 

62.46 
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1862 


C 


GLU 


755 


32.068 


-10.912 


61.252 


1 00 


60 75 


1863  O    GLU  755   32.906  -11.809   61.240  1.00  61.42 
1864  N    ILE  756   32.363   -9.677   61.636  1.00  60.43 
1865  CA   ILE  756   33.709   -9.328   62.074  1.00  65.41 
1866  CB   ILE  756   34.369   -8.336   61.114  1.00  58.24 
1867  CG2  ILE  756   35.741   -7.964   61.623  1.00  57.70 
1868  CG1  ILE  756   34.478   -8.957   59.729  1.00  59.57 
1869  CD1  ILE  756   35.178   -8.090   58.743  1.00  61.24 
1870  C    ILE  756   33.625   -8.693   63.439  1.00  63.18 
1871  O    ILE  756   34.373   -9.043   64.351  1.00  58.77 
1872  N    ILE  757   32.705   -7.740   63.548  1.00  59.89 
1873  CA   ILE  757   32.434   -7.024   64.785  1.00  61.90 
1874  CB   ILE  757   31.115   -6.254   64.668  1.00  64.77 
1875  CG2  ILE  757   30.778   -5.602   65.991  1.00  62.61 
1876  CG1  ILE  757   31.224   -5.237   63.529  1.00  62.60 
1877  CD1  ILE  757   29.902   -4.649   63.097  1.00  59.02 
1878  C    ILE  757   32.298   -8.069   65.879  1.00  60.76 
1879  O    ILE  757   32.990   -8.016   66.890  1.00  60.98 
1880  N    THR  758   31.396   -9.022   65.660  1.00  59.72 
1881  CA   THR  758   31.184  -10.104   66.608  1.00  61.21 
1882  CB   THR  758   30.224  -11.141   66.017  1.00  58.63 
1883  OG1  THR  758   30.260  -11.028   64.592  1.00  60.88 
1884  CG2  THR  758   28.792  -10.925   66.527  1.00  60.75 
1885  C    THR  758   32.540  -10.770   66.885  1.00  61.53 
1886  O    THR  758   33.167  -10.549   67.926  1.00  61.44 
1887  N    ASN  759   32.979  -11.572   65.924  1.00  62.04 
1888  CA   ASN  759   34.240  -12.305   65.965  1.00  60.58 
1889  CB   ASN  759   34.426  -13.001   64.623  1.00  60.54 
1890  CG   ASN  759   33.242  -12.774   63.689  1.00  60.06 
1891  OD1  ASN  759   32.581  -11.723   63.736  1.00  59.27 
1892  ND2  ASN  759   32.976  -13.747   62.825  1.00  59.97 
1893  C    ASN  759   35.470  -11.432   66.249  1.00  59.08 
1894  O    ASN  759   36.564  -11.710   65.734  1.00  63.02 
1895  N    GLN  760   35.282  -10.388   67.059  1.00  60.43 
1896  CA   GLN  760   36.336   -9.442   67.438  1.00  61.43 
1897  CB   GLN  760   36.620   -8.446   66.302  1.00  56.98 
1898  CG   GLN  760   37.445   -8.966   65.121  1.00  61.31 
1899  CD   GLN  760   38.839   -9.446   65.514  1.00  63.88 
1900  OE1  GLN  760   39.445   -8.949   66.463  1.00  59.43 
1901  NE2  GLN  760   39.356  -10.409   64.769  1.00  62.96 
1902  C    GLN  760   35.850   -8.659   68.651  1.00  59.31 
1903  O    GLN  760   36.625   -8.353   69.563  1.00  59.88 
1904  N    ILE  761   34.546   -8.371   68.649  1.00  61.08 
1905  CA   ILE  761   33.861   -7.606   69.704  1.00  64.40 
1906  CB   ILE  761   32.318   -7.582   69.469  1.00  60.77 
1907  CG2  ILE  761   31.759   -8.983   69.575  1.00  61.46 
1908  CG1  ILE  761   31.626   -6.686   70.500  1.00  61.86 
1909  CD1  ILE  761   30.115   -6.703   70.390  1.00  62.34 
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J910 
1911 

1912 
1913 
1914 
1915 
1916 
1917 
1918 
1919 
1920 
1921 
1922 
1923 
1924 
1925 
1926 
1927 
1928 
1929 
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1931 
1932 
1933 
1934 
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1936 
1937 
1938 
1939 
1940 
1941 
1942 
1943 
1944 
1945 
1946 
1947 
1948 
1949 
1950 
1951 
1952 
1953 
1954 
1955 
1956 
1957 



N 

CD 
CA 
CB 
CG 



N 

CA 
CB 
CG 
CD 
CE 
NZ 



N 
CA 
CB 
CG 
CD1 
CE1 
CD2 
CE2 
CZ 
OH 



N 

CA 
CB 
OG 

C 

O 

N 
CA 
CB 
CG 
OD1 
ND2 
C 
O 
N 

CA 



ILE 
ILE 
PRO 
PRO 
PRO 
PRO 
PRO 
PRO 
PRO 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
TYR 
SER 
SER 
SER 
SER 
SER 
SER 
ASN 
ASN 
ASN 
ASN 
ASN 
ASN 
ASN 
ASN 
GLY 
GLY 
GLY 
GLY 



761 
761 
762 
762 
762 
762 
762 
762 
762 
763 
763 
763 
763 
763 
763 
763 
763 
763 
764 
764 
764 
764 
764 
764 
764 
764 
764 
764 
764 
764 
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765 
765 
765 
765 
765 
766 
766 
766 
766 
766 
766 
766 
766 
767 
767 
767 
767 
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^34.111 
33.900 
34.543 
34.581 
34.800 
34.718 
35.309 
36.160 
36.257 
37.203 
38.580 
39.424 
40.926 
41.206 
40.834 
39.378 
39.256 
40.164 
38.820 
39.428 
38.399 
38.901 
38.708 
39.190 
39.600 
40.092 
39.876 
40.304 
40.204 
41.310 
39.653 
40.324 
39.780 
38.359 



40.207 
39.574 
40.833 
40.867 
40.390 
38.959 
38.593 
38.143 

42.301 
42.503 
43.281 
44.689 

45.366 
46.588 



-8.077 
-7.317 
-9.338 
-10.454 
-9.840 
-11.350 
-11.528 
-9.419 
-8.697 
-9.887 
-9.637 
-10.882 
-10.651 
-9.910 
-10.723 
-11.026 
-8.359 
-7.846 
-7.847 
-6.653 
-5.865 
-5.590 
-6.516 
-6.287 
-4.422 
-4.183 
-5.118 
-4.863 
-5.654 
-5.963 
-4.443 
-3.419 
-1.999 
-1.934 



-3.760 

-3.046 

-4.884 

-5.404 

-6.867 

-7.018 

-6.456 

-7.782 

-5.305 

-4.889 

-5.673 

-5.653 

-6.867 

-6.912 



71.139 
72.087 
71.319 
70.353 
72.670 
72.482 
71.122 
73.230 
74.232 
72.558 
72.960 
72.594 
72.347 
71.025 
69.776 
69.623 
72.438 
73.086 
71.289 
70.673 
69.876 
68.486 
67.467 
66.180 
68.192 
66.902 
65.901 
64.612 
71.535 
72.003 
71.701 
72.519 
72.208 
72.144 
74.016 



1.00 
J .00 
1.00 
_1.00 
1.00 
_1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
_1.00 
1.00 
1 .00 
.00 
1.00 
1.00 
1.00 
1.00 
1 .00 
1. 00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1 .00 
1 .00 
.00 
.00 
1.00 
.00 
.00 



74.809 

74.371 

75.738 

75.745 

75.239 

74.197 

75.971 

76.311 

77.469 

75.478 

75.860 

75.232 

75.041 



1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 



63.06 
60.11 
58.19 
61.52 
60.05 
60.47 
62.98 
59.07 
61.15 
59.41 
61.27 
61.66 
62.08 
58.99 
60.53 
62.94 
63.01 
63.43 
64.75 
58.42 
62.63 
65.16 
57.96 
61.30 
61.22 
61.36 
59.50 
62.78 
61.96 
61.04 
60.29 
61.03 
57.40 
59.00 
59.80 
60.59 
59.36 
63.02 
59.32 
61.45 
60.60 
64.82 
62.76 
60.59 
59.11 
59.91 
61.75 
61.17 
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1958 


N 


ASN 


768 


44.532 


-7.846 


74.887 


1.00 


62.80 


1959  CA   ASN  768   44.942   -9.109   74.276  1.00  62.46 
1960  CB   ASN  768   43.717  -10.032   74.246  1.00  59.72 
1961  CG   ASN  768   42.798   -9.832   75.467  1.00  60.42 
1962  OD1  ASN  768   41.697  -10.404   75.538  1.00  59.89 
1963  ND2  ASN  768   43.248   -9.020   76.427  1.00  63.45 
1964  C    ASN  768   45.543   -8.940   72.855  1.00  63.10 
1965  O    ASN  768   46.095   -9.882   72.282  1.00  62.80 
1966  N    ILE  769   45.418   -7.744   72.284  1.00  58.10 
1967  CA   ILE  769   45.984   -7.464   70.967  1.00  60.18 
1968  CB   ILE  769   45.006   -6.713   70.036  1.00  61.64 
1969  CG2  ILE  769   45.569   -6.703   68.614  1.00  59.97 
1970  CG1  ILE  769   43.623   -7.361   70.051  1.00  65.54 
1971  CD1  ILE  769   42.605   -6.626   69.175  1.00  65.14 
1972  C    ILE  769   47.192   -6.544   71.150  1.00  59.44 
1973  O    ILE  769   47.217   -5.701   72.058  1.00  61.97 
1974  N    LYS  770   48.175   -6.692   70.267  1.00  61.57 
1975  CA   LYS  770   49.391   -5.890   70.314  1.00  60.09 
1976  CB   LYS  770   50.601   -6.778   70.033  1.00  61.51 
1977  CG   LYS  770   51.961   -6.172   70.339  1.00  61.08 
1978  CD   LYS  770   53.041   -7.224   70.047  1.00  59.40 
1979  CE   LYS  770   54.344   -6.982   70.801  1.00  62.15 
1980  NZ   LYS  770   55.339   -8.089   70.604  1.00  63.19 
1981  C    LYS  770   49.333   -4.776   69.277  1.00  63.90 
1982  O    LYS  770   49.439   -5.031   68.071  1.00  62.08 
1983  N    LYS  771   49.161   -3.545   69.754  1.00  59.06 
1984  CA   LYS  771   49.103   -2.376   68.884  1.00  61.08 
1985  CB   LYS  771   48.386   -1.212   69.589  1.00  63.15 
1986  CG   LYS  771   49.188   -0.525   70.712  1.00  63.84 
1987  CD   LYS  771   48.443    0.681   71.308  1.00  63.12 
1988  CE   LYS  771   49.384    1.588   72.100  1.00  59.83 
1989  NZ   LYS  771   48.821    2.970   72.186  1.00  60.16 
1990  C    LYS  771   50.532   -1.976   68.550  1.00  60.42 
1991  O    LYS  771   51.276   -1.561   69.430  1.00  62.03 
1992  N    LEU  772   50.928   -2.120   67.290  1.00  60.80 
1993  CA   LEU  772   52.285   -1.756   66.890  1.00  58.97 
1994  CB   LEU  772   52.629   -2.368   65.533  1.00  62.77 
1995  CG   LEU  772   52.781   -3.885   65.492  1.00  63.82 
1996  CD1  LEU  772   52.780   -4.346   64.046  1.00  62.61 
1997  CD2  LEU  772   54.071   -4.295   66.203  1.00  63.11 
1998  C    LEU  772   52.405   -0.243   66.812  1.00  61.57 
1999  O    LEU  772   51.513    0.428   66.289  1.00  60.62 
2000  N    LEU  773   53.499    0.296   67.341  1.00  59.76 
2001  CA   LEU  773   53.704    1.734   67.317  1.00  60.00 
2002  CB   LEU  773   53.602    2.320   68.725  1.00  59.66 
2003  CG   LEU  773   52.221    2.321   69.380  1.00  62.80 
2004  CD1  LEU  773   52.290    2.999   70.729  1.00  61.23 
2005  CD2  LEU  773   51.233    3.053   68.490  1.00  65.68 
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2006 



2007 



2008 
2009 



2010 



2011 



2012 



N 



CA 



CB 



LEU 



LEU 



PHE 



PHE 



CG 



2013 



2014 



2015 



2016 



2017 



2018 



2019 



2020 



2021 



2022 



2023 
2024 



2025 
2026 
2027 
2028 
2029 
2030 
2031 
2032 
2033 
2034 
2035 
2036 
2037 
2038 
2039 
2040 
2041 
2042 
2043 
2044 
2045 
2046 
2047 
2048 
2049 
2050 
2051 
2052 
2053 



CD1 



CD2 



CE1 



CE2 



CZ 



N 



PHE 



PHE 



PHE 



PHE 



PHE 



PHE 



773 



773 



774 



774 



774 



55.051 



55.911 
55.219 



56.451 



774 



774 



774 



774 



PHE 



PHE 



PHE 



CA 



CB 



CG 
CD2 



ND1 



CE1 
NE2 



CA 

CB 

CG 

CD 

OE1 

NE2 



OXT 
r CB~ 
CG 
CD 
OE1 
OE2 



JSI 

CA 
JvJ 

CA 

CB 

CG 

CD 

OE1 



HIS 



HIS 



HIS 



HIS 



HIS 



HIS 
HIS 
HIS 
HIS 
HIS 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
GLU 



774 



774 



774 



774 



775 



775 



775 
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37.536 



35.646 



34.943 



1.00 



1.00 



1.00 



1.00 



33.472 



35.502 



35.590 



35.882 



36.411 



36.300 



36.663 



35.848 



37.802 



36.162 



38.120 



37.296 



37.851 



38.234 
38.661 
40.047 
40.865 
40.073 
1 40.953 
39.110 
39.086 
37.958 
37.693 
36.222 
38.439 
38.858 
39.479 
37.974 
37.746 
39.018 
39.266 
39.838 
41.073 
I 41.741 
43.008 
43.127 
44.501 
42.200 
44.279 
45.183 
44.973 
42.670 
44.043 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 



1.00 
1.00 



1.00 



1.00 
1.00 



60.68 



61.48 



60.91 



58.39 



62.30 



63.55 



63.47 



60.63 



60.75 



62.24 



65.41 



64.45 



62.88 



61.45 



61.89 
63.04 



66.17 



1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
_1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 



64.34 
61.04 
59.38 
61.90 
58.55 
58.23 
60.68 
61.24 
60.02 
62.86 
60.90 
62.46 
60.02 
61.60 
62.78 
63.89 
60.11 
61.05 
62.95 
62.51 
63.33 
59.08 
60.29 
61.56 
59.81 
57.77 
61.04 
63.24 
58.16 
62.09 
63.07 
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27.228 


24.528 


42.498 
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61 .48 


2831 
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62.30 
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2834 
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33.712 
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22.467 
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60.03 
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33.664 


21 .403 


40.624 


1.00 


62.69 


2837 
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33.312 


21.451 


39.141 


1.00 


60.54 


2838 
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21 .038 
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61.90 
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21.542 ! 


40.815 
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60.18 


2840 


O 


SER 


612 


35.842 


20.597 


41 .209 


1.00 
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63.42 
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59.89 
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2875 
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2877 
2878 
2879 
2880 
2881 
2882 
2883 
2884 
2885 
2886 
2887 
2888 
2889 
2890 
2891 
2892 
2893 
2894 
2895 
2896 
2897 
2898 
2899 
2900 
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2904 
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2910 
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_N 
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CB 
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CB 
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SER 
SER 
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SER 
SER 
SER 
SER 
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SER 
SER 
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ALA 
ALA 
ALA 
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ALA 
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ASN 
ASN 
ASN 
ASN 
ASN 
ASN 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
LEU 
CYS 
CYS 
CYS 
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615 
615 
615 
616 
616 
616 
616 
616 
616 
617 
617 
617 
617 
617 
617 
618 
618 
618 
618 
618 
619 
619 
619 
619 
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619 
619 
619 
620 
620 
620 
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"31.787 
36.547 
36.984 
36.726 
_37.4Q8 
36.380 
35.731 
38.347 
38.444 
,39.021 
39.972 
41.253 
41.690 
39.433 
40.099 
38.230 
37.600 
38.399 
37.475 
37.175 
.37.692 
37.610 
38.467 
39.881 
40.813 
.40.047 
36.205 
35.933 
35.305 
33.925 
33.599 
34.516 
33.992 
34.578 
33.031 
33.520 
31.728 



30.757 

29.822 

30.365 

29.302 

30.776 

30.001 

29.267 

30.191 

29.562 

30.612 

32.249 



18.204 
18.523 
17.588 
18.615 
17.559 
16.632 
17.299 
18.047 
17.424 
19.163 
19.681 
18.847 
18.714 
19.675 
19.196 
20.213 
20.261 
21.165 
18.866 
18.725 
17.836 
16.465 
15.523 
15.426 
15.986_ 
14.729 
15.922 
15.005 
16.487 
16.044 
15.266 
14.087 
13.354 
13.145 
17.266 
18.400 
17.022 
18.096 
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18.564 
18.516 
19.998 
17.973 
17.009 
18.952 
18.902 
19.216 
18.417 



44.928 
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45.093 
43.109 
42.387 
41.757 
I 40.688 
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40.246 
41.534 
40.560 
40.638 
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39.119 
38.196 
38.931 
37.612 
36.676 
37.005 
35.820 
37.820 
37.330 
38.178 
37.663 
38.241 
36.547 
37.241 
36.469 
38.028 
37.999 
39.271 
39.589 
40.805 
38.408 
37.890 
37.844 
37.808 
37.739 



36.545 
35.272 
34.204 
35.547 
39.033 
39.249 
39.903 
41.201 
42.276 
42.005 
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1.00 
_1.00 
1.00 
_1.00 
1.00 
1.00 
1.00 
.1.00 
1.00 
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1.00 
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1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
.1.00 
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1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 



63.88 
61.87 
61.19 
62.69 
60.64 
64.46 
63.47 
59.36 
61.79 
62.74 
60.01 
56.86 
64.58 
62.31 
60.08 
62.60 
62.84 
61.64 
60.94 
63.12 
60.02 
60.02 
61.65 
65.19 
58.62 
59.93 
63.97 
61.24 
61.46 
61.85 
63.36 
60.08 
65.17 
59.21 
63.79 
62.56 
60.88 
60.93 
59.84 
61.52 
61.79 
62.82 
59.84 
59.39 
60.82 
61.70 
61.70 
55.72 
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61.58 
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2954 


CG 


LEU 


627 
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ARG 
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533 ; 
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31.381 

30.879 

31.011 

29.978 

29.732 

30.936 

31.704 

31.109 

33.034 

28.695 

28.055 

28.334 

27.125 

26.821 

26.235 

25.223 

24.486 

24.739 

CO.fl I 

24.014 
27.151 
26.086 
28.337 
28.465 


7.986 

8.014 

7.126 

6.151 

5.285 

4.415 

3.828 

3.357 

3.847 

6.851 

6.417 

7.946 

8.682 

9.786 

9.274 

10.258 

9.732 

10.038 

10.878 

9.501 

3.274 

9.401 

9.643 

10.180 


34.254 

33.132 

35.186 

34.916 

36.159 

36.579 

35.393 

34.420 

35.480 

34.466 

33.512 

35.134 

34.767 

35.775 

37.069 

37.602 

38.743 

40.011 

40.308 

40.984 

33.360 

32.750 

32.855 

31.497 


1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 
1.00 
1.00 
1.00 
1.00 


61.07 

61.02 

63.77 

61.66 

61.41 

65.26 

64.37 

60.73 

59.46 

61.12 

62.13 

59.27 

61.29 

59.65 

62.42 

60.12 

61.79 

64.89 

60.64 

52.92 

51.86 

59.63 

54.70 

50.91 
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3038 


CB 


PRO 


637 


28.363 


1 1 .932 


23.490 


1.00 


63.07 


3039  CG   PRO  637   27.855   10.482   23.398  1.00  62.95 
3040  C    PRO  637   26.780   13.647   24.551  1.00  61.80 
3041  O    PRO  637   27.038   14.531   23.736  1.00  60.33 
3042  N    CYS  638   25.605   13.523   25.193  1.00  63.35 
3043  CA   CYS  638   24.557   14.549   25.225  1.00  58.89 
3044  CB   CYS  638   23.122   14.023   25.351  1.00  62.48 
3045  SG   CYS  638   22.633   12.668   24.333  1.00  61.93 
3046  C    CYS  638   24.925   14.896   26.642  1.00  59.34 
3047  O    CYS  638   25.366   16.010   26.968  1.00  61.64 
3048  N    MET  639   24.773   13.878   27.486  1.00  59.95 
3049  CA   MET  639   25.094   14.058   28.870  1.00  62.51 
3050  CB   MET  639   24.794   12.794   29.647  1.00  56.25 
3051  CG   MET  639   24.597   13.021   31.126  1.00  59.11 
3052  SD   MET  639   23.446   14.225   31.808  1.00  60.73 
3053  CE   MET  639   24.286   14.281   33.307  1.00  59.61 
3054  C    MET  639   26.567   14.451   28.934  1.00  60.92 
3055  O    MET  639   27.074   14.783   30.000  1.00  59.57 
3056  N    TYR  640   27.244   14.409   27.782  1.00  60.93 
3057  CA   TYR  640   28.622   14.866   27.693  1.00  62.32 
3058  CB   TYR  640   29.585   13.827   27.110  1.00  61.41 
3059  CG   TYR  640   30.961   14.446   26.910  1.00  62.45 
3060  CD1  TYR  640   31.797   14.687   28.006  1.00  59.75 
3061  CE1  TYR  640   32.996   15.372   27.862  1.00  59.57 






WO 03/015692 



PCT/US02/22648 



3062 



3063 



3064 



3065 



3066 
3067 



CD2 



CE2 



CZ 



OH 



3068 
3069 
3070 
3071 
3072 
3073 
3074 
3075 
3076 
3077 
3078 
3079 
3080 
3081 
3082 
3083 
3084 
3085 
3086 
3087 
3088 
3089 
3090 
3091 
3092 
3093 
3094 
3095 
3096 
3097 
3098 
3099 
3100 
3101 
3102 
3103 
3104 
3105 
3106 
3107 
3108 
3109 



TYR 



TYR 



TYR 



TYR 



N 

CA 

CB 

CG 

OD1 

OD2 



O 

_N 

CA 

CB 

CG 

CD 

OE1 

NE2 



N 

CA 
CB 
SG 



N 

CA 
CB 
CG 
CD 
CE 
NZ 



_N 

CA 
CB 
CG 
CD2 
ND1 
CE1 
NE2 
C 



TYR 



TYR 
ASP 
ASP 
ASP 
ASP 
ASP 
ASP 
ASP 
ASP 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
GLN 
CYS 
CYS 
CYS 
CYS 
CYS 
CYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
LYS 
HIS 
HIS 
HIS 
HIS 
HIS 
HIS 
HIS 
HIS 
HIS 
HIS 



640 



640 



640 



640 



640 



640 
641 
641 
641 
641 
641 
641 
641 
641 
642 
642 
642 
642 
642 
642 
642 
642 
642 
643 
643 
643 
643 
643 
643 
644 
644 
644 
644 
644 
644 
644 
644 
644 
645 
645 
645 
645 
645 
645 
645 
645 
645 
645 






31.376 



32.578 



33.381 



34.554 



28.650 



29.264 
27.985 
27.946 
26.821 
27.232 
26.317 
28.453 
27.729 
28.073 
27.124 
26.801 
25.298 
_24.570 
24.905 
24.656 
25.462 
27.353 
27.430 
27.678 
28.291 
28.348 
27.004 
29.691 
30.377 
30.093 
31.415 
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33.034 
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18.994 
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24.617 
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27.115 
27.180 
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25.345 
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28.518 
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32.130 
30.015 
1 30.836 
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26.889 
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1.00 
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J .00 
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_1.00 
1.00 
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1.00 
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1.00 
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,.1.00 
1 .00 
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27.490 
26.009 
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25.123 
23.892 
23.948 
29.652 
29.804 



.1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 
1.00 



59.20 



64.39 



62.16 



62.68 



60.42 
62.43 
63.11 
64.46 
64.30 
61.31 
59.56 
63.20 
61.02 
60.07 
63.30 
60.83 
63.77 
59.93 
61.47 
59.50 
60.78 
60.90 
60.17 
61.78 
62.44 
66.84 
64.90 
61.39 
61.37 
63.75 
59.22 
57.84 
63.56 
63.09 
63.23 
61.68 
61.63 
62.21 
59.46 
61.40 
62.01 
61.11 
62.23 
62.89 
60.35 
58.72 
60.22 
62.73 
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46.052 


1.00 


58.24 


3239  CB   GLU  661   38.598   37.523   46.765  1.00  64.85 
3240  CG   GLU  661   38.276   37.577   48.225  1.00  61.43 
3241  CD   GLU  661   39.283   38.360   48.991  1.00  60.83 
3242  OE1  GLU  661   39.961   39.204   48.365  1.00  59.88 
3243  OE2  GLU  661   39.387   38.143   50.219  1.00  59.48 
3244  C    GLU  661   37.548   36.918   44.591  1.00  58.50 
3245  O    GLU  661   36.573   37.536   44.178  1.00  62.37 
3246  N    GLU  662   38.529   36.516   43.798  1.00  63.69 
3247  CA   GLU  662   38.453   36.784   42.374  1.00  59.27 
3248  CB   GLU  662   39.646   36.158   41.657  1.00  57.91 
3249  CG   GLU  662   40.974   36.663   42.122  1.00  62.43 
3250  CD   GLU  662   42.104   35.948   41.436  1.00  62.15 
3251  OE1  GLU  662   41.969   34.732   41.223  1.00  57.52 
3252  OE2  GLU  662   43.128   36.585   41.119  1.00  61.99 
3253  C    GLU  662   37.138   36.188   41.831  1.00  61.83 
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3254 


O 


GLU 


662 


36.492 


36.771 


40.963 


1 00 

■ .WV 


63 75 


3255 


N 


TYR 


663 


36.751 


35.031 


42.361 


1 00 


59 76 


3256 


CA 


TYR 


663 


35.525 


34.336 


41.962 


1 00 


62 22 


3257 


CB 


TYR 


663 


35.439 


32.992 


42.694 


1 00 


58 01 


3258 


CG 


TYR 


663 


34.073 


32.342 


42.676 


1.00 


62 53 


3259 


CD1 


TYR 


663 


33.536 


31.831 


41.499 


1 00 


60 26 


3260 


CE1 


TYR 


663 


32.298 


31.201 


41.495 


1.00 


62 40 


3261 


CD2 


TYR 


663 


33.330 


32.212 


43.850 


1 00 


58 63 


3262 


CE2 


TYR 


663 


32.096 


31.590 


43.855 


1.00 


62 25 


3263 


CZ 


TYR 


663 


31.587 


31.084 


42.676 


1 00 


63 25 


3264 


OH 


TYR 


663 


30.372 


30.448 


42.682 


1 00 


62 04 


3265 


C 


TYR 


663 


34.240 


35.125 


42.228 


1.00 


61 08 


3266 


O 


TYR 


663 


33.429 


35.343 


41.322 


1.00 


58 41 


3267 


N 


LEU 


664 


34.055 


35.528 


43.480 


1 00 


60 5Q 


3268 


CA 


LEU 


664 


32.876 


36.270 


43.884 


1 00 


61 06 


3269 


CB 


LEU 


664 


32.976 


36.618 


45.369 


1 00 

1 m\J\J 


63 Qfi 


3270 


CG 


LEU 


664 


33.063 


35.440 


46.343 


1 00 


63 ft1 


3271 


CD1 


LEU 


664 


33.322 


35.929 


47.750 


1 00 


60 7Q 


3272 


CD2 


LEU 


664 


31.786 


34.656 


46.283 


1.00 


58 66 


3273 


C 


LEU 


664 


32.692 


37.539 


43.057 


1 00 


62 83 


3274 


O 


LEU 


664 


31.558 


37.955 


42.812 


1 00 


5Q 8R 


3275 


N 


CYS 


665 


33.809 


38.139 


42.632 


1 00 


5Q Q7 


3276 


CA 


CYS 


665 


33.805 


39.365 


41.831 


1 00 


63 5Q 


3277 


CB 


CYS 


665 


35.167 


40.043 


41.869 


1.00 


60 16 


3278 


SG 


CYS 


665 


35.586 


40.757 


43.441 


1 00 


62 Q4 


3279 


C 


CYS 


665 


33.475 


39.091 


40.388 


1.00 


60 01 


3280 


O 


CYS 


665 


32.794 


39.876 


39.735 


1.00 


57 49 


3281 


N 


MET 


666 


33.997 


37.984 


39.883 


1.00 


60 45 


3282 


CA 


MET 


666 


33.752 


37.601 


38.510 


1.00 


61 35 


3283 


CB 


MET 


666 


34.733 


36.517 


38.077 


1.00 


60 05 


3284 


CG 


MET 


666 


36.156 


36.993 


37.902 


1.00 


63.77


3285 


SD 


MET 


666 


37.274 


35.592 


37.856 


1.00 


59.75 


3286 


CE 


MET 


666 


37.139 


35.071 


36.150 


1.00 


62.17 


3287 


C 


MET 


666 


32.338 


37.078 


38.411


1.00 


61.52 


3288 


O 


MET 


666 


31.681 


37.255 


37.388 


1.00 


60.47 


3289 


N 


LYS 


667 


31.869 


36.433 


39.475 


1.00 


61.04 


3290 


CA 


LYS 


667 


30.516 


35.898 


39.482 


1.00


62.69 


3291 


CB 


LYS 


667 


30.261 


35.036 


40.726 


1.00 


61.46 


3292 


CG 


LYS 


667 


28.966 


34.228 


40.671 


1.00 


59.84 


3293 


CD 


LYS 


667 


28.678 


33.497 


41.975 


1.00 


63.25 


3294 


CE 


LYS 


667 


28.483 


34.471 


43.123 


1.00 


59.23 


3295 


NZ 


LYS 


667 


27.639 


33.891 


44.192 


1.00 


61.97 


3296 


C 


LYS 


667 


29.554 


37.066 


39.461 


1.00 


63.05 


3297 


O 


LYS 


667 


28.459 


36.945 


38.942 


1.00 


60.65 


3298 


N 


THR 


668 


29.981 


38.201 


40.011 


1.00 


61.58 


3299 


CA 


THR 


668 


29.146 


39.398 


40.061 


1.00 


60.37 


3300 


CB 


THR 


668 


29.609 


40.357 


41.149 


1.00 


59.81 


3301 


OG1 


THR 


668 


29.776 


39.634 


42.370 


1.00 


59.26 
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3302 


CG2 


THR 


668 


28.588 


41.442


41.365 


1.00 


57.52 


3303 


C 


THR 


668 


29.174 


40.146 


38.746 


1.00 


60.69 


3304 


O 


THR 


668 


28.184 


40.749 


38.348 


1.00 


61.79 


3305 


N 


LEU 


669 


30.320 


40.111 


38.076 


1.00 


58.82 


3306 


CA 


LEU 


669 


30.479 


40.774 


36.786 


1.00 


60.62


3307 


CB 


LEU 


669 


31.947 


40.863 


36.412 


1.00 


60.70


3308 


CG 


LEU 


669 


32.673 


41.944


37.192 


1.00 


61.83 


3309 


CD1 


LEU 


669 


34.131 


41.996 


36.761 


1.00 


63.30 


3310 


CD2 


LEU 


669 


31.981 


43.275 


36.953 


1.00 


62.82 


3311 


C 


LEU 


669 


29.736 


40.028 


35.707 


1.00 


63.57 


3312 


O 


LEU 


669 


29.574 


40.521 


34.599 


1.00 


63.32 


3313 


N 


LEU 


670 


29.303 


38.823 


36.034 


1.00 


62.04 


3314 


CA 


LEU 


670 


28.558 


38.030 


35.087 


1.00 


64.79 


3315 


CB 


LEU 


670 


28.662 


36.542 


35.432 


1.00 


63.41 


3316 


CG 


LEU 


670 


29.983 


35.838 


35.078 


1.00 


62.66 


3317 


CD1 


LEU 


670 


29.918 


34.407 


35.554 


1.00 


61.38 


3318 


CD2 


LEU 


670 


30.239 


35.867 


33.580 


1.00 


61.74 


3319 


C 


LEU 


670 


27.111 


38.495 


35.114 


1.00 


60.65 


3320 


O 


LEU 


670 


26.405 


38.402 


34.119 


1.00 


60.67 


3321 


N 


LEU 


671 


26.673 


39.008 


36.257 


1.00 


60.80 


3322 


CA 


LEU 


671 


25.308 


39.500 


36.386 


1.00 


60.13 


3323 


CB 


LEU 


671 


24.977 


39.756 


37.852 


1.00 


55.83 


3324 


CG 


LEU 


671 


23.636 


40.403 


38.198 


1.00 


59.73 


3325 


CD1 


LEU 


671 


22.495 


39.498 


37.819 


1.00 


58.32 


3326 


CD2 


LEU 


671 


23.606 


40.677 


39.673 


1.00 


59.37 


3327 


C 


LEU 


671 


25.178 


40.804 


35.613 


1.00 


60.50 


3328 


O 


LEU 


671 


24.076 


41.295 


35.377 


1.00 


61.24 


3329 


N 


LEU 


672 


26.320 


41.354 


35.219 


1.00 


59.65 


3330 


CA 


LEU 


672 


26.355 


42.613


34.492 


1.00 


62.17 


3331 


CB 


LEU 


672 


27.128 


43.650 


35.309 


1.00 


60.10 


3332 


CG 


LEU 


672 


26.917 


43.688 


36.822 


1.00 


65.92 


3333 


CD1 


LEU 


672 


27.728 


44.819 


37.407 


1.00 


60.54 


3334 


CD2 


LEU 


672 


25.460 


43.885 


37.148 


1.00 


58.59 


3335 


C 


LEU 


672 


27.027 


42.456 


33.131 


1.00 


62.29 


3336 


O 


LEU 


672 


27.489 


43.430 


32.554 


1.00 


61.76 


3337 


N 


SER 


673 


27.070 


41.237 


32.613 


1.00 


60.43 


3338 


CA 


SER 


673 


27.732 


40.980 


31.342 


1.00 


60.60 


3339 


CB 


SER 


673 


28.212 


39.538 


31.317 


1.00 


60.81 


3340 


OG 


SER 


673 


27.281 


38.718


31.987


1.00 


56.49 


3341 


C 


SER 


673 


26.949 


41.280


30.074 


1.00 


60.41 


3342 


O 


SER 


673 


27.542 


41.502


29.020 


1.00 


60.96 


3343 


N 


SER 


674 


25.625 


41.267 


30.160 


1.00 


62.26 


3344 


CA 


SER 


674 


24.800 


41.565 


28.995 


1.00 


62.20 


3345 


CB 


SER 


674 


24.359 


40.281 


28.298 


1.00 


61.34 


3346 


OG 


SER 


674 


23.730 


39.420 


29.221 


1.00 


62.88 


3347 


C 


SER 


674 


23.581 


42.371 


29.402 


1.00 


60.51 


3348 


O 


SER 


674 


22.904 


42.050 


30.376 


1.00 


62.02 


3349 


N 


VAL 


675 


23.321


43.432


28.653 


1.00 


61.79 
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3350 


CA 


VAL 


675 


22.190 


44.299 


28.911 


1.00 


61.63 


3351 


CB 


VAL 


675 


22.670 


45.712 


29.194 


1.00 


59.56 


3352 


CG1 


VAL 


675 


23.388 


45.748 


30.517 


1.00 


62.21


3353 


CG2 


VAL 


675 


23.598 


46.164 


28.078 


1.00 


63.23 


3354 


C 


VAL 


675 


21.325 


44.320 


27.658 


1.00 


62.05 


3355 


O 


VAL 


675 


21.757 


43.861 


26.603 


1.00 


62.28


3356 


N 


PRO 


676 


20.077 


44.817 


27.764 


1.00 


60.51 


3357 


CD 


PRO 


676 


19.330 


45.132 


28.991 


1.00 


63.50 


3358 


CA 


PRO 


676 


19.191 


44.880 


26.593 


1.00 


62.22 


3359 


CB 


PRO 


676 


17.896


45.494


27.156


1.00 


58.53 


3360 


CG 


PRO 


676 


18.322 


46.117 


28.488 


1.00 


60.68 


3361 


C 


PRO 


676 


19.839 


45.746 


25.514 


1.00 


58.14 


3362 


O 


PRO 


676 


20.824 


46.435 


25.792 


1.00 


61.96 


3363 


N 


LYS 


677 


19.309 


45.710 


24.293 


1.00 


60.64 


3364 


CA 


LYS 


677 


19.906 


46.501 


23.215 


1.00 


59.47 


3365 


CB 


LYS 


677 


19.025 


46.521 


21.970 


1.00 


61.69 


3366 


CG 


LYS 


677 


19.782


46.912 


20.707 


1.00 


61.32 


3367 


CD 


LYS 


677 


18.832 


47.051 


19.514 


1.00 


61.48 


3368 


CE 


LYS 


677 


19.604 


47.129 


18.198 


1.00 


62.13 


3369 


NZ 


LYS 


677 


20.435 


45.908 


17.952


1.00 


60.25 


3370 


C 


LYS 


677 


20.145 


47.929 


23.686 


1.00 


59.29 


3371 


O 


LYS 


677 


21.248 


48.235 


24.158 


1.00 


63.94 


3372 


N 


ASP 


678 


19.129 


48.796 


23.580 


1.00 


61.98 


3373 


CA 


ASP 


678 


19.302


50.178 


24.028 


1.00 


61.19 


3374 


CB 


ASP 


678 


18.178 


51.083


23.506 


1.00 


60.11 


3375 


CG 


ASP 


678 


18.515 


52.582 


23.647 


1.00 


60.40 


3376 


OD1 


ASP 


678 


18.311 


53.325 


22.652 


1.00 


60.21 


3377 


OD2 


ASP 


678 


18.980 


53.011


24.745 


1.00 


61.19 


3378 


C 


ASP 


678 


19.395 


50.284 


25.558 


1.00 


61.02 


3379 


O 


ASP 


678 


18.592 


50.955 


26.210 


1.00 


61.94 


3380 


N 


GLY 


679 


20.398 


49.604 


26.108 


1.00 


58.42 


3381 


CA 


GLY 


679 


20.649 


49.605 


27.534 


1.00 


56.97 


3382 


C 


GLY 


679 


19.449 


49.444 


28.438 


1.00 


56.85 


3383 


O 


GLY 


679 


18.362 


49.031 


28.028 


1.00 


59.44 


3384 


N 


LEU 


680 


19.674 


49.788 


29.696 


1.00 


63.29 


3385 


CA 


LEU 


680 


18.655 


49.704 


30.727 


1.00 


62.45 


3386 


CB 


LEU 


680 


19.297 


49.181 


32.017 


1.00 


64.86 


3387 


CG 


LEU 


680 


20.118 


47.895 


31.832 


1.00 


59.84 


3388 


CD1 


LEU 


680 


20.946 


47.595 


33.068 


1.00 


59.62 


3389 


CD2 


LEU 


680 


19.181 


46.760 


31.543 


1.00 


61.01 


3390 


C 


LEU 


680 


18.056 


51.090


30.955 


1.00 


60.58 


3391 


O 


LEU 


680 


18.433 


52.063 


30.298 


1.00 


62.60 


3392 


N 


LYS 


681 


17.120 


51.174 


31.888 


1.00 


62.27 


3393 


CA 


LYS 


681 


16.486 


52.441 


32.197 


1.00 


63.23 


3394 


CB 


LYS 


681 


15.146 


52.211 


32.901 


1.00 


59.95 


3395 


CG 


LYS 


681 


14.188 


51.417 


32.034 


1.00 


60.87 


3396 


CD 


LYS 


681 


12.837 


51.183 


32.665 


1.00 


61.02 


3397 


CE 


LYS 


681 


12.004 


50.295 


31.740 


1.00 


64.07 
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3398 


NZ 


LYS 


681 


10.616 


50.060 


32.218 


1.00 


58.77 


3399 


C 


LYS 


681 


17.414 


53.242 


33.072 


1.00 


60.18 


3400 


O 


LYS 


681 


17.373 


54.462 


33.069 


1.00 


63.10 


3401 


N 


SER 


682 


18.278 


52.554 


33.802 


1.00 


60.92 


3402 


CA 


SER 


682 


19.214 


53.240 


34.681 


1.00 


62.50 


3403 


CB 


SER 


682 


18.953 


52.786 


36.113 


1.00 


62.86 


3404 


OG 


SER 


682 


17.564 


52.589 


36.296 


1.00 


61.69 


3405 


C 


SER 


682 


20.682 


52.993 


34.272 


1.00 


58.65 


3406 


O 


SER 


682 


21.558 


52.781 


35.120 


1.00 


60.76 


3407 


N 


GLN 


683 


20.924 


53.053 


32.961 


1.00 


60.04 


3408 


CA 


GLN 


683 


22.241 


52.840 


32.348 


1.00 


61.19 


3409 


CB 


GLN 


683 


22.156 


53.127 


30.850 


1.00 


59.58 


3410 


CG 


GLN 


683 


23.397 


52.757 


30.056 


1.00 


62.33 


3411 


CD 


GLN 


683 


23.606 


51.259 


29.955 


1.00 


62.33 


3412 


OE1 


GLN 


683 


22.651 


50.502 


29.759 


1.00 


62.13 


3413 


NE2 


GLN 


683 


24.858 


50.823 


30.065 


1.00 


61.30 


3414 


C 


GLN 


683 


23.397 


53.655 


32.934 


1.00 


61.70 


3415 


O 


GLN 


683 


24.561 


53.335 


32.719 


1.00 


62.18 


3416 


N 


GLU 


684 


23.083 


54.710 


33.666 


1.00 


64.02 


3417 


CA 


GLU 


684 


24.117 


55.539 


34.257 


1.00 


60.92 


3418 


CB 


GLU 


684 


23.541 


56.904 


34.590 


1.00 


62.70 


3419 


CG 


GLU 


684 


22.396 


56.780 


35.574 


1.00 


62.30 


3420 


CD 


GLU 


684 


21.884


58.112 


36.063 


1.00 


61.33 


3421 


OE1 


GLU 


684 


21.260 


58.120 


37.153 


1.00 


61.85 


3422 


OE2 


GLU 


684 


22.092 


59.135 


35.363 


1.00 


62.66 


3423 


C 


GLU 


684 


24.582 


54.867 


35.534 


1.00 


61.21 


3424 


O 


GLU 


684 


25.741 


54.979 


35.924 


1.00 


63.65 


3425 


N 


LEU 


685 


23.659 


54.181 


36.197 


1.00 


62.46 


3426 


CA 


LEU 


685 


23.992 


53.487 


37.429 


1.00 


60.48 


3427 


CB 


LEU 


685 


22.731 


53.265 


38.269 


1.00 


62.31 


3428 


CG 


LEU 


685 


22.992 


53.036 


39.764 


1.00 


59.19 


3429 


CD1 


LEU 


685 


23.700 


54.245 


40.360 


1.00 


59.86 


3430 


CD2 


LEU 


685 


21.684


52.795 


40.485 


1.00 


61.02 


3431 


C 


LEU 


685 


24.657 


52.148 


37.086 


1.00 


61.95 


3432 


O 


LEU 


685 


25.524 


51.662 


37.804 


1.00 


61.31 


3433 


N 


PHE 


686 


24.264 


51.566 


35.964 


1.00 


60.07 


3434 


CA 


PHE 


686 


24.832 


50.302 


35.560 


1.00 


60.10 


3435 


CB 


PHE 


686 


24.147 


49.785 


34.311 


1.00 


61.65 


3436 


CG 


PHE 


686 


24.500 


48.372 
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THR 


739 


16.041 


20.437 


27.659 


1.00 


59.46 


3888 


CG2 


THR 


739 


17.234 


19.302 


25.902 


1.00 


67.45 


3889 


C 


THR 


739 


13.721 


20.740 


26.282 


1.00 


60.54 


3890 


O 


THR 


739 


12.911 


19.949 


26.758 


1.00 


60.32 


3891 


N 


PHE 


740 


13.589 


22.049 


26.433 


1.00 


60.26 


3892 


CA 


PHE 


740 


12.426 


22.572 


27.136 


1.00 


60.59 


3893 


CB 


PHE 


740 


12.645 


24.013 


27.586 


1.00 


60.53 


3894 


CG 


PHE 


740 


11.387


24.682 


28.073 


1.00 


60.33 


3895 


CD1 


PHE 


740 


10.976 


24.543 


29.399 


1.00 


60.24 


3896 


CD2 


PHE 


740 


10.591 


25.417 


27.196 


1.00 


60.88 


3897 


CE1 


PHE 


740 


9.794 


25.124 


29.842 


1.00 


62.19 


3898 


CE2 


PHE 


740 


9.407 


26.001 


27.629 


1.00 


63.06 


3899 


CZ 


PHE 


740 


9.005 


25.857 


28.954 


1.00 


59.45 


3900 


C 


PHE 


740 


11.269


22.562 


26.161 


1.00 


62.20 


3901 


O 


PHE 


740 


10.102 


22.514 


26.560 


1.00 


60.22 


3902 


N 


LEU 


741 


11.619 


22.631 


24.877 


1.00 


61.98 


3903 


CA 


LEU 


741 


10.650 


22.665 


23.783 


1.00 


61.09 


3904 


CB 


LEU 


741 


11.158 


23.561 


22.656 


1.00 


63.24 


3905 


CG 


LEU 


741 


11.286 


25.053 


22.919 


1.00 


57.80 


3906 


CD1 


LEU 


741 


11.680 


25.732 


21.617 


1.00 


59.82 


3907 


CD2 


LEU 


741 


9.966 


25.608 


23.455 


1.00 


59.42 


3908 


C 


LEU 


741 


10.313 


21.316 


23.170 


1.00 


60.22 


3909 


O 


LEU 


741 


9.748 


21.267 


22.079 


1.00 


59.13 


3910 


N 


ASP 


742 


10.662 


20.230 


23.845 


1.00 


61.37 


3911 


CA 


ASP 


742 


10.388 


18.914 


23.309 


1.00 


61.75 


3912 


CB 


ASP 


742 


11.679 


18.315 


22.733 


1.00 


62.29 


3913 


CG 


ASP 


742 


11.476 


16.916 


22.145 


1.00 


59.85 


3914 


OD1 


ASP 


742 


12.450 


16.354 


21.576 


1.00 


62.02 


3915 


OD2 


ASP 


742 


10.348 


16.378 


22.253 


1.00 


63.60 


3916 


C 


ASP 


742 


9.843 


18.041 


24.420 


1.00 


61.37 


3917 


O 


ASP 


742 


10.614 


17.412 


25.153 


1.00 


62.69 


3918 


N 


LYS 


743 


8.517 


18.018 


24.564 


1.00 


61.79 


3919 


CA 


LYS 


743 


7.882 


17.183 


25.595 


1.00 


58.96 


3920 


CB 


LYS 


743 


6.381 


17.501 


25.727 


1.00 


62.19 


3921 


CG 


LYS 


743 


6.056 


18.836 


26.404 


1.00 


57.47 


3922 


CD 


LYS 


743 


4.545 


19.047 


26.473 


1.00 


59.18 


3923 


CE 


LYS 


743 


4.180 


20.313 


27.232 


1.00 


61.36 


3924 


NZ 


LYS 


743 


2.699 


20.507 


27.295 


1.00 


57.73 


3925 


C 


LYS 


743 


8.055 


15.688 


25.281 


1.00 


61.20 
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CA 
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3954 
3955 
3956 
3957 
3958 
3959 
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3965 
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3969 
3970 
3971 
3972 
3973 



CG 
SD 
CE 



N 

CA 
CB 
OG 



N 

CA 

CB 

CG2 

CG1 

CD1 



N

CA 

CB 

CG 

CD 

OE1 

OE2 

C 

O 

N

CA 

CB 

CG 

CD1 
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CE2 

CZ 



LYS 



THR 
THR 



THR 



THR 



THR 



THR 



THR 



MET 



MET 



MET 
MET 
MET 
MET 
MET 
MET 
SER 
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SER 
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SER 
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LE 

ILE
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744 
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7.912 



8.366 



8.580 



8.792 



7.881 



8.550 



14.843 



15.380 



14.007 



13.974 



14.890 



9.818 



9.976 



10.711 



11.933 



12.982 
14.366 
15.440 
16.317 
11.435 
11.973 
10.394 
9.744 
9.149 
8.512 
10.597 
10.576 
11.322 
12.140 
13.557 
14.374 
14.282 
15.664 
11.441 
11.166 
11.167 
10.457 
9.803 
8.628 
7.998 
8.753 
6.744 
11.333 
12.503 
10.781 



12.574 



13.406 



12.195 



14.279 



13.887 



26.165 



24.020 



23.554 



22.047 



21.426 



21.513 



1.00 



1.00 



1.00 



1.00 



1.00 



24.202 



24.261 



24.646 



11.484 

11.773

12.801 

12.604 

13.948 

13.461 

14.821 

14.566 



14.976 
14.645 
15.948 
15.039 
13.843 
13.137 
14.635 
14.657 
13.249 
13.111 
15.089 
14.408 
16.207 
16.678 
17.115 
17.635 
15.911 
16.211 
17.829 
18.891 
17.597 
18.546 
17.736 
18.410 
17.505 
16.918 
17.396 
19.701 
19.498 
20.913 
22.079 
23.202 
22.827 
21.790 
23.639 
21.555 
23.409 
22.379 



25.334 



1.00 



1.00 



1.00 



1.00 



25.147 
25.612 
24.965 
23.610 
26.774 
27.629 
27.025 
28.326
28.576 
29.842 
29.533 
30.545 
29.443 
30.571
30.098 
31.275 
29.484 
28.976 
31.318 
30.747 
32.601 
33.466 
34.620 
35.400 
36.516 
37.340 
36.574 
34.022 
34.367 
34.046 
34.601 
33.571 
32.506 
31.624 
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32.305 
30.545 
31.221 
30.341 



1.00 
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1.00 
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59.67 
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61.58 
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57.60 



62.60 



59.01 
60.84 
58.90 
62.07 
63.96 
64.77 
59.79 
60.83 
61.05 
61.76 
61.97 
61.88 
60.50 
61.46 
57.79 
62.83 
61.46 
65.11 
63.68 
62.48
59.16 
62.34 
60.97 
63.60 
59.56 
61.60 
62.18 
59.89 
59.20 
62.20 
60.36 
63.43 
62.31 
63.18 
58.27 
59.47 
59.31 
60.26 
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3974 


C 


PHE 


749 


10.491 


22.634 


35.646 


1.00 


57.34


3975 


O 


PHE 


749 


9.296 


22.348 


35.598 


1.00 


62.05


3976 


N 


PRO 


750 


10.971 


23.425 


36.612 


1.00 


61.11 


3977 


CD 


PRO 


750 


12.322 


23.572 


37.178 


1.00 


62.36 


3978 


CA 


PRO 


750 


9.955 


23.918 


37.535 


1.00 


63.79 


3979 


CB 


PRO 


750 


10.745 


24.124 


38.834 


1.00 


59.84 


3980 


CG 


PRO 


750 


12.072 


24.510 


38.341 


1.00 


61.92 


3981 


C 


PRO 


750 


9.283 


25.174 


37.042 


1.00 


61.44 


3982 


O 


PRO 


750 


9.000 


25.325 


35.852 


1.00 


61.38 


3983 


N 


GLU 


751 


9.016 


26.087 


37.962 


1.00 


61.10 


3984 


CA 


GLU 


751 


8.375 


27.335 


37.604 


1.00 


59.57 


3985 


CB 


GLU 


751 


7.535 


27.828 


38.778 


1.00 


60.79 


3986 


CG 


GLU 


751 


6.534 


26.804 


39.190 


1.00 


60.87 


3987 


CD 


GLU 


751 


5.716 


26.358 


38.013 


1.00 


58.45 


3988 


OE1 


GLU 


751 


6.004 


26.825 


36.889 


1.00 


58.76 


3989 


OE2 


GLU 


751 


4.768 


25.564 


38.205 


1.00 


62.99 


3990 


C 


GLU 


751 


9.449 


28.328 


37.275 


1.00 


59.04 


3991 


O 


GLU


751 


9.644 


28.698 


36.115 


1.00 


63.48 


3992 


N 


MET 


752 


10.154 


28.736 


38.319 


1.00 


62.70 


3993 


CA 


MET 


752 


11.223 


29.693 


38.184 


1.00 


60.75 


3994 


CB 


MET 


752 


12.241 


29.499 


39.306 


1.00 


60.00 


3995 


CG 


MET 


752 


13.222 


30.641 


39.392 


1.00 


60.74 


3996 


SD 


MET 


752 


12.305 


32.203 


39.387 


1.00 


61.02 


3997 


CE 


MET 


752 


12.040 


32.394 


41.073 


1.00 


58.25 


3998 


C 


MET 


752 


11.919 


29.549 


36.847 


1.00 


61.43 


3999 


O 


MET 


752 


12.062 


30.515 


36.103 


1.00 


63.92 


4000 


N 


LEU 


753 


12.329 


28.326 


36.537 


1.00 


61.53 


4001 


CA 


LEU 


753 


13.044 


28.081 


35.307 


1.00 


63.94 


4002 


CB 


LEU 


753 


13.749 


26.729 


35.370 


1.00 


61.07 


4003 


CG 


LEU 


753 


15.278 


26.834 


35.432 


1.00 


59.22 


4004 


CD1 


LEU 


753 


15.720 


27.666 


36.636 


1.00 


60.65 


4005 


CD2 


LEU 


753 


15.870 


25.436 


35.488 


1.00 


56.88 


4006 


C 


LEU 


753 


12.182 


28.179 


34.073 


1.00 


62.24 


4007 


O 


LEU 


753 


12.539 


28.882 


33.138 


1.00 


61.75 


4008 


N 


ALA 


754 


11.049 


27.488 


34.061 


1.00 


58.32 


4009 


CA 


ALA 


754 


10.156 


27.536 


32.902 


1.00 


60.89 


4010 


CB 


ALA 


754 


8.856 


26.790 


33.209 


1.00 


62.87 


4011 


C 


ALA 


754 


9.851 


28.995 


32.581 


1.00 


60.33 


4012 


O 


ALA 


754 


10.031 


29.471 


31.454


1.00 


61.42 


4013 


N 


GLU 


755 


9.406 


29.698 


33.615 


1.00 


62.19 


4014 


CA 


GLU 


755 


9.040 


31.101 


33.535 


1.00 


60.87 


4015 


CB 


GLU 


755 


8.481 


31.550 


34.891 


1.00 


60.79 


4016 


CG 


GLU 


755 


7.821 


32.911 


34.858 


1.00 


62.93 


4017 


CD 


GLU 


755 


6.333 


32.829 


35.097 


1.00 


56.10 


4018 


OE1 


GLU 


755 


5.741 


31.746 


34.841 


1.00 


59.07 


4019 


OE2 


GLU 


755 


5.761 


33.857 


35.531 


1.00 


65.63 


4020 


C 


GLU 


755 


10.163 


32.053 


33.106 


1.00 


64.44 


4021 


O 


GLU 


755 


10.006 


33.269


33.209 


1.00 


61.31 



WO 03/015692 



PCT/US02/22648 




4070 


N 


PRO 


762 


13.128 


32.938 


22.626 


1.00 


60.85 


4071 


CD 


PRO 


762 


11.810 


32.652 


22.044 


1.00 


60.12 


4072 


CA 


PRO 


762 


13.999 


33.474 


21.571 


1.00 


60.84 


4073 


CB 


PRO 


762 


13.355 


32.980 


20.264 


1.00 


58.66 


4074 


CG 


PRO 


762 


12.240 


32.017 


20.716 


1.00 


58.17 


4075 


C 


PRO 


762 


14.222 


34.968 


21.533 


1.00 


62.93 


4076 


O 


PRO 


762 


14.168 


35.566 


20.457 


1.00 


59.88 


4077 


N 


LYS 


763 


14.405 


35.599 


22.687 


1.00 


59.58 


4078 


CA 


LYS 


763 


14.750 


37.015 


22.653 


1.00 


62.56 


4079 


CB 


LYS 


763 


14.713 


37.645 


24.045 


1.00 


61.84 


4080 


CG 


LYS 


763 


15.014 


39.141 


24.061 


1.00 


61.96 


4081 


CD 


LYS 


763 


14.703 


39.723 


25.430 


1.00 


62.42 


4082 


CE 


LYS 


763 


13.428 


39.096 


25.979 


1.00 


61.00 


4083 


NZ


LYS 


763 


12.992 


39.651 


27.285 


1.00 


62.85 


4084 


C 


LYS 


763 


16.182 


36.666 


22.292 


1.00 


58.43 


4085 


O 


LYS 


763 


16.780 


37.217 


21.354 


1.00 


59.83 


4086 


N 


TYR 


764 


16.668 


35.665 


23.036 


1.00 


59.46 


4087 


CA 


TYR 


764 


17.999 


35.106 


22.895 


1.00 


61.50 


4088 


CB 


TYR 


764 


18.291 


34.109 


24.020 


1.00 


62.35 


4089 


CG 


TYR 


764 


19.085 


34.715 


25.149 


1.00 


57.24 


4090 


CD1 


TYR 


764 


18.526 


34.872 


26.424 


1.00 


62.21 


4091 


CE1 


TYR 


764 


19.236 


35.509 


27.451 


1.00 


63.69 


4092 


CD2 


TYR 


764 


20.378 


35.200 


24.927 


1.00 


62.73 


4093 


CE2 


TYR 


764 


21.095 


35.837 


25.942 


1.00 


62.44 


4094 


CZ 


TYR 


764 


20.516 


35.992 


27.195 


1.00 


65.70 


4095 


OH 


TYR 


764 


21.206 


36.670 


28.168 


1.00 


63.44 


4096 


C 


TYR 


764 


18.156 


34.400 


21.570 


1.00 


60.32 


4097 


O 


TYR 


764 


17.580 


34.816 


20.556 


1.00 


60.80 


4098 


N 


SER 


765 


18.922 


33.310 


21.597 


1.00 


62.63 


4099 


CA 


SER 


765 


19.209 


32.535 


20.391 


1.00 


62.57 


4100 


CB 


SER 


765 


17.908 


31.971 


19.777 


1.00 


63.02 


4101 


OG 


SER 


765 


18.172 


31.232 


18.586 


1.00 


63.09 


4102 


C 


SER 


765 


19.904 


33.487 


19.403 


1.00 


61.20 


4103 


O 


SER 


765 


21.121 


33.703 


19.474 


1.00 


60.85 


4104 


N 


ASN 


766 


19.099 


34.064 


18.513 


1.00 


59.31 


4105 


CA 


ASN 


766 


19.520 


35.005 


17.477 


1.00 


61.38 


4106 


CB 


ASN 


766 


18.344 


35.932 


17.155 


1.00 


63.73 


4107 


CG 


ASN 


766 


17.006 


35.195 


17.116 


1.00 


60.04 


4108 


OD1 


ASN 


766 


16.493 


34.720 


18.153 


1.00 


64.69 


4109 


ND2 


ASN 


766 


16.433 


35.091 


15.916 


1.00 


59.73 


4110 


C 


ASN 


766 


20.764 


35.857 


17.800 


1.00 


64.20 


4111 


O 


ASN 


766 


21.906 


35.462 


17.491 


1.00 


60.72 


4112 


N 


GLY 


767 


20.523 


37.032 


18.396 


1.00 


59.36 


4113 


CA 


GLY 


767 


21.589 


37.961 


18.766 


1.00 


63.17 


4114 


C 


GLY 


767 


21.096


39.388 


19.032 


1.00 


61.77 


4115 


O 


GLY 


767 


21.905 


40.321 


19.172 


1.00 


58.54 


4116 


N 


ASN 


768 


19.772 


39.550 


19.118 


1.00 


61.70 


4117 


CA 


ASN 


768 


19.115 


40.849 


19.347 


1.00 


60.25 
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4166 


N 


PHE 


774 


35.101 


42.352 


27.712 


1.00


fin i^ 


4167 


CA 


PHE 


774 


36.246 


43.072 


28.248


1.00
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U 1 . Jf 


4168 


CB 


PHE 


774 


36.893 


42.279 


29.377


1.00
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66 OQ 


4169 


CG 


PHE 


774 


36.280 


42.543 


30.698


1.00


61.67


4170 


CD1 


PHE 


774 


36.524 


43.741 


31.355


1.00


SQ 42 


4171 


CD2 


PHE 


774 


35.385 


41.650


31.248


1.00


^6 29 


4172 


CE1 


PHE 


774 


35.879 


44.050 


32.536


1.00


64 7Q 


4173 


CE2 
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9 


40.476 


0.604 


50.517


1.00


fi9 ft7 


4340 


O 


HOH 


10 


42.083 


33.017 


29.431


1.00




4341 


O 


HOH 


11 


40.224 


-1.905


63.310


1.00


fin ^r 


4342 


O 


HOH 


12 


29.926 


49.219 


30.317 


1.00


fin iq 


4343 


O 


HOH 


13 


63.481 


3.211 


57.703


1.00


fi9 


4344 


O 


HOH 


14


45.679 


44.833 


38.756 


1.00


fin Q7 1 


4345 


O 


HOH 


15 


21.388 


1.839 


41.400 


1.00


R1 41 


4346 


O 


HOH 


16 


47.452 


-16.061 


63.707 


1.00


fin n 


4347 


O 


HOH 


17 


52.653 


15.955 


63.901 


1.00 


64 7S 


4348 


O 


HOH 


18


62.913 


1.964 


67.923 


1.00 


64 H 


4349 


O


HOH 


19 


62.507 


3.936 


69.792 


1.00


fin ! 


4350 


O


HOH 


20 


11.730


26.749 


44.436 


1.00 


6n 7Q 


4351 


O


HOH 


21 


48.735 


13.308 


64.587 


1.00 


62.06 


4352 


O


HOH 


22 


32.377 


39.863 


58.144 


1.00 


63.51 


4353 


O


HOH 


23 


58.924 


9.831 


70.947 


1.00 


61.40 


4354 


O


HOH 


24 


39.278 


17.448 


64.290 


1.00 


62.12 


4355 


O


HOH 


25 


40.573 


48.042 


36.816 


1.00 


60.96 


4356 


O


HOH 


26 


40.494 


35.299 


48.387 


1.00 


59.93 


4357 


O


HOH 


27 


61.454 


1.678 


61.901 


1.00 


60.51 
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4358 


O 


HOH 


28 


9.075 


22.638 


42.296 


1.00 


61.65 


4359 


O 


HOH 


29 


51.369 


13.900 


63.592 


1.00 


64.00 


4360 


O 


HOH 


30 


61.184 


-0.481 


44.937 


1.00 


61.95 


4361 


O 


HOH 


31 


19.041 


16.035 


52.737 


1.00 


60.85 


4362 


O 


HOH 


32 


37.487 


3.963 


49.092 


1.00 


60.40 


4363 


O 


HOH 


33 


31.183 


34.399 


55.395 


1.00 


61.32 


4364 


O 


HOH 


34 


25.672 


33.490 


53.795 


1.00 


61.76 


4365 


O 


HOH 


35 


24.467 


27.177 


45.107 


1.00 


62.37 


4366 


O 


HOH 


36 


47.899 


30.685 


35.691 


1.00 


60.62 


4367 


O 


HOH 


37 


31.250 


45.014 


24.427 


1.00 


63.26 


4368 


O 


HOH 


38 


60.719 


-0.340 


49.987 


1.00 


60.94 


4369 


O 


HOH 


39 


48.761 


14.305 


46.147 


1.00 


59.45 


4370 


O 


HOH 


40 


52.252 


11.824 


45.533 


1.00 


59.86 


4371 


O 


HOH 


41 


40.704 


30.604 


47.765 


1.00 


62.04 


4372 


O 


HOH 


42 


34.599 


19.541 


73.265 


1.00 


61.69 


4373 


O 


HOH 


43 


44.135 


32.951 


48.092 


1.00 


60.11 


4374 


O 


HOH 


44 


16.447 


16.136 


55.224 


1.00 


58.77 


4375 


O 


HOH 


45 


37.470 


21.079 


29.057 


1.00 


61.47 


4376 


O 


HOH 


46 


14.411 


15.785 


52.085 


1.00 


58.97 


4377 


O 


HOH 


47 


27.199 


25.588 


51.919 


1.00 


58.58 


4378 


O 


HOH 


48 


32.466 


25.097 


53.254 


1.00 


60.88 


4379 


O


HOH 


49 


17.927 


39.612 


49.972 


1.00 


61.48 


4380 


O


HOH 


50 


17.243 


38.022 


52.339 


1.00 


61.61 


4381 


O


HOH 


51 


65.714 


6.374 


72.458 


1.00 


61.45 


4382 


O


HOH 


52 


25.540 


34.686 


57.601 


1.00 


59.81 


4383 


O


HOH 


53 


22.812 


3.452 


38.767 


1.00 


62.42 


4384 


C1 


DEX 




31.791 


3.330 


56.615 


1.00


59.00 


4385 


H1 


DEX 


1 


30.892 


2.719 


56.626 


1.00 


59.00 


4386 


C2 


DEX 


1 


32.066 


4.057 


55.552 


1.00 


59.00 


4387 


H2 


DEX 


1 


31.418 


4.016 


54.717 


1.00 


59.00 


4388 


C3 


DEX 


1 


33.314 


4.929 


55.514 


1.00 


59.00 


4389 


C4 


DEX 


1 


34.176 


5.061 


56.733 


1.00 


59.00 


4390 


H4 


DEX 


1 


35.013 


5.729 


56.720 


1.00 


59.00 


4391 


C5 


DEX 


1 


33.915 


4.329 


57.855 


1.00 


59.00 


4392 


C6 


DEX 


1 


34.782 


4.456 


59.133 


1.00 


59.00 


4393 


H61 


DEX 


1 


35.558 


5.172 


59.015 


1.00 


59.00 


4394 


H62 


DEX 




35.262 


3.483 


59.339 


1.00 


59.00 


4395 


C7 


DEX 


1 


33.905 


4.834 


60.331 


1.00 


59.00 


4396 


H71 


DEX 




33.520 


5.861 


60.202 


1.00 


59.00 


4397 


H72 


DEX 


1 


34.515 


4.837 


61.236 


1.00 


59.00 


4398 


C8 


DEX 




32.690 


3.903 


60.544 


1.00 


59.00 


4399 


H8 


DEX 


1 


33.063 


2.878 


60.787 


1.00 


59.00 


4400 


C9 


DEX 




31.759 


3.803 


59.162 


1.00 


59.00 


4401 


C10 


DEX 




32.677 


3.304 


57.900 


1.00 


59.00 


4402 


C11 


DEX 




30.360 


2.986 


59.327 


1.00 


59.00 


4403 


H11 


DEX 




29.743 


3.203 


58.478 


1.00 


59.00 


4404 


C12 


DEX 




29.599 


3.415 


60.596 


1.00 


59.00 


4405 


H121 


DEX 




28.744 


2.788 


60.729 


1.00 


59.00 
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4406 


H122 


DEX 


1 


29.221 


4.448


60 4^6 


1.00


59.00


4407 


C13 


DEX 


1


30.518 


3.414


61 Q94 


1.00


59.00


4408 


C14 


DEX 


1 


31.758 


4.387


61.796


1.00


59.00


4409 


H14 


DEX 




31.359 


5 40^ 


R1 401 


1.00


59.00


4410 


C15 


DEX 


1 


32.374 


4 58Q 


R** HQS 


1.00


59.00


4411 


H151 


DEX 


1 


32.893 


5.547


R^ 111 


1.00


59.00


4412 


H152 


DEX 


1 


33.119


3 7Qfi 


R^ 9ft1 


1.00


59.00


4413 


C16 


DEX 


1 


31.175 


4.488


R4 OQ*} 


1.00


59.00


4414 


H16 


DEX 


1 


31.391 


w.wj 


R4 74 


1.00


59.00


4415 


C17 


DEX 


1 


29.863 


4.144


R^ 1Rfi 


1.00


59.00


4416 


C18 


DEX 


1 


30.929 


1.834


R9 ^9^, 


1.00


59.00


4417 


H181 


DEX 


1 


31.535 


1.833


R^ 941 


1.00


59.00


4418 


H182 


DEX 


1 


30.050


1.248


R9 4QR 


1.00


59.00


4419 


H183 


DEX 


1 


31.537 


1.374


R1 


1.00


59.00


4420 


C19 


DEX 


1


33.270 


1.833


ni ^ 


1.00


59.00


4421 


H191 


DEX 


1 


33.916 


1.724


Rft QOR 


1.00


59.00


4422 


H192 


DEX 




32.485 


1 095 


Rft 119 

JO. I 1 41 


1 nn 
I .uu 


co on 

oy.uu 


4423 


H193 


DEX 


1 


33.870 


1 605 


R.7 1**4 
O 1 . 1 OH- 


1 nn 
I .uu 


co on 

oy.uu 


4424 


C20 


DEX 




28.759 


3 270 


DO. Of O 


1 nn 
i .UU 


co no 
59. UU 


4425 


C21 


DEX 


-j 


27.338 


3 ^48 


CO oco 
DO.OOO 


1 nn 
1 .UU 


co on 

oy.uu 


4426 


H211 


DEX 


■j 


27.350 


3 6^7 


R9 9ft** 


1 nn 

1 .UU 


co nn 
59. UU 


4427 


H212 


DEX 


•j 


26.827 


4 148 


R^ R7R 
UO.O f U 


1 nn 

1 .uu 


co on 

oy.uu 


4428 


C22


DEX 


1 


31 008 


5 RCH 


f\A OA7 
OH.&H f 


*i nn 
i .UU 


co r\n 


4429 


H221 


DEX 


■j 


30.160 




oo.o iy 


4 nn 
1 .uu 


CO OO I 

oy.uu 


4430 


H222 


DEX 




31.912 


5 877 


R*^ 


1 nn 
! .UU 


co oo 

oy.uu 


4431 


H223 


DEX 




30.811 


6 588 


R4 **1 

04.0 I O 


*i nn 
1 .UU 


CO oo 

oy.uu 


4432 


F1 


DEX 


1 


31.331 


5 130 


□o.ooo 


1 nn 
I .uu 


co nn 

oy.uu 


4433 


O1


DEX 1 


-I 


33.617 


5 51? 


^4 RH7 


i nn 
1 .UU 


co nn 

oy.uu 


4434 


O2


DEX 


1 


30.601 


1 580 


**R1 


i nn 
l .UU 


CO oo 

oy.uu 


4435 


HO2


DEX 




29.784 


1 183 


70R 


1 nn 
I .uu 


co oo 

oy.uu 


4436 


O3


DEX 


1 


29.236 


5 409 


R9 71 1 

U^ ./II 


1 nn 
1 .uu 


co nn 

oy.uu 


4437 


H3 


DEX 


1 


28.816 


5 780 


R^ 47^ 


1 nn 
I .uu 


co nn 

oy.uu 


4438 


O4


DEX 


-j 


29.058 


2 511 


R4 R1R 

U*t .O IO 


1 nn 

I .UU 


co nn 

oy.uu 


4439 


O5


DEX 


1 


26.689 


2 117 


fi^ 4Q9 


1 nn 
I .uu 


co nn 

oy.uu 


4440 


H5 


DEX 


-j 


25.816 


2.344 


63 756 


1 on 

I .uu 


nn 
oy.uu 


4441 


C1 


DEX 




21.344 


23.582 


Q7 «04 


1 on 

1 -UU 


co nn 
oy.uu 


4442 


H1 


DEX 




20.325 


23.208 


^7 R^4 

O/ .UO*r 


1 nn 

i .uu 


co nn 
oy.uu 


4443 


C2 


DEX 




22.105 


23.392 


^8 R70 

OO.U / u 


1 nn i 

I .uu 


co nn 
oy.uu 


4444 


H2 


DEX 




21.710 


22 910 


cno 


1 nn 
I .uu 


co nn 
oy.uu 


4445 


C3 


DEX 


1 


23.539 


23 892 


^8 fi«7 

JU.UO r 


1 nn 
I .uu 


co nn 
oy.uu 


4446 


C4 


DEX 


1 


24.137 


24 501 


^7 4^0 

O / .*tJU 


1 nn 

I .uu 


co nn 
oy.uu 


4447 


H4 


DEX 


1 


25.173 


24.791 


37.441 


1.00 


59.00 


4448 


C5 


DEX 




23.372 


24.700 


36.346 


1.00 


59.00 


4449 


C6 


DEX 




23.965 


25.312 


35.061 


1.00 


59.00 


4450 


H61 


DEX 




24.996 


25.542 


35.157 


1.00 


59.00 


4451 


H62 


DEX 




23.444 


26.267 


34.853 


1.00 


59.00 


4452 


C7 


DEX 




23.752 


24.345 


33.877 


1.00 


59.00 


4453 


H71 


DEX 




24.370 


23.444 


34.001 


1.00 


59.00 
4454


H72 


DEX 


1 


24.092 


24 829 

^ • W A- W 


32 956 

' m— . WWW 


1 00 

1 .WW 


nn 

*Ji7. wW 


4455 


C8 


DEX 


1 


22.275 


23 885 


33 692 

W W . W W^_ 


1 00 

» • WW 


nn 

WW. WW 


4456 


H8 


DEX 


1 


21.638 


24 764 


33 460 

WW .~w w 


1 00 

1 .WW 


nn 

WW. WW 


4457 


C9 


DEX 


1 


21.676 


23 232 

^> w . w£- 


35 081 

WW. WW 1 


1 00 

1 .WW 


*.q nn 

ww . WW 


4458 


C10 


DEX 


1 


21.819 


24 294 


36 329 

W W . Wfc W 


1 00 

1 .WW 


nn 

WW . WW 


4459 


C11 


DEX 


1 


20 197 


2? 585 

<- £- . WW w 


34 938 

w" . WWW 


1 on 

■ .WW 


nn 

ww. WW 


4460 


H11 


DEX 


1 


20.028 


21 Q74 


35 784 

w w > r w*t 


1 00 

1 .WW 


nn 

ww. WW 


4461 


C12 


DEX 


1 


20.107 


21 699 


33 700 

WW. » WW 


1 00 

1 .WW 


*SQ nn 


4462 


H121 


DEX 


1 


19.130 


21 365 


33 B02 

W W . \J\J£m. 


1 on 

1 . W W 


nn 

w57. UU 


4463 


H122 


DEX 


1 


20.720 


20 795 

^W. 1 W w 


33 859 

WW .Www 


1 on 

1 .WW 


*>q nn 

WW . WW 


4464 


C13 


DEX 


1 


20.600 


22 429 


^2 344 

w^. . w I I 


1 nn 

1 .WW 


nn 


4465 


C14 


DEX 


1 


22.105 


22 863 


32 51 5 

<Jmmm . W 1 W 


1 nn 

1 .WW 


nn 

WW .WW 


4466 


H14 


DEX 


1 


22.701 


21 953 


32 834 


1 nn 

1 .WW 


nn 

WW .WW 


4467 


C15 


DEX 


1 


22 602 


23 242 


31 129 


1 on 

1 .WW 


so nn 

WW . WW 


4468 


H151 


DEX 


1 


23.685 


23 110 


31 097 

w 1 . WW / 


1 nn 

1 . WW 


5Q nn 

WW . WW 


4469 


H152 


DEX 


-j 


22.383 


24 310 


30 934 

WW . W w~ 


1 nn 

1 .WW 


*sq nn 

WW .WW 


4470 


C16 


DEX 


1 


21.806 


22 306 

Mm Cm .WWW 


30 152 

WW. 1 W£— 


1 nn 

1 .WW 


5Q nn 

ww .WW 


4471 


H16 


DEX 


1 


21.207 


22 984 

£— . W W~ 


29 504 

^ w . W W~ 


1 nn 

1 .WW 


*sq nn 

WW .WW 


4472 


C17 


DEX 


1 


20.783 


21 450 

fc. I ."TWW 


31 0Q7 

W 1 .WW 1 


1 nn 

1 .WW 


rq nn 1 

ww .WW 


4473 


C18 


DEX 


1 


19.540 


23 677 

^ W . W f f 


31 944 

w 1 • w ■ ■ 1 


1 nn 

1 .WW 


nn 

ww • ww 


4474 


H181 


DEX 


1 


19.873 


24 157 


31 015 

w 1 . w 1 w 


1 nn 

1 .WW 


*sq nn 

WW .WW 


4475 


H182 


DEX 


1 


18.547 


23 297 


31 792 

W 1.1 W £— 


1 nn 

1 .WW 


5Q nn 

WW .WW 


4476 


H183 


DEX 


1 


19.525 


24 449 


32 700 

\JmC . f WW 


1 nn 

1 .WW 


nn 

ww .WW 


4477 


C19 


DEX 


hj 


20.959 


25 638 


36 205 

WW .mC\J\J 


1 on 

1 .WW 


50 no 

WW . WW 


4478 


H191 


DEX 


1 


21 232 

Mm, 1 • W m^. 


26 215 


35 303 

ww . www 


1 nn 

1 .WW 


*sq nn 

WW .WW 


4479 


H192 


DEX 


1 


19 899 

1 W .WWW 


25 426 


36 127 

WW. 1 mC 1 


1 nn 

1 .WW 


5Q nn 

ww . LI W 


4480 


H193 


DEX 


-j 


21.132 


26 270 

W . ^» I w 


37 072 

w 1 . W # ^- 


1 nn 

1 .WW 


5Q nn 

wC7 . WW 


4481 


C20 


DEX 


hj 


19.417 


21 067 

1 .WW f 


30 421 

W W . ^mC 1 


1 nn 

1 .WW 


5Q nn 


4482 


C21 


DEX 




18.443 


20 176 


31 204 


1 00 

1 .WW 


5Q nn 

ww .WW 


4483 


H211 


DEX 


1 


17.932 


20 800 

4mm W. WWW 


31 959 

w 1 .www 


1 00 

1 .WW 


5Q nn 

WW .WW 


4484 


H212 


DEX 


1 


19.031 


19.423 


31 779 

W 1 . f # w 


1 00 

1 .WW 


59 00 

ww .WW 


4485 


C22 


DEX 


1 


22.671 


21.454 


29.301 


1.00 


59.00 


4486 


H221 


DEX 


^ 


22.061 


20.835 


28.644 


1 00 

1 .WW 


59 00 

ww .WW 


4487 


H222 


DEX 


1 


23.334 


22.077 


28.688 


1.00 


59 00 

ww .WW 


4488 


H223 


DEX 


1 


23.300 


20.785 


29.933 


1.00 


59.00 


4489 


F1 


DEX 




22.519 


22.128 


35 397 

W W . W W f 


1 00 

1 .WW 


59 00 

WW .WW 


4490 


O1


DEX 


^ 


24.201 


23 808 

«— w . w w w 


39 692 

WW « ww^_ 


1.00 


59 00 

ww • WW 


4491 


O2


DEX 


1 


19.179 


23.598 


34.905 


1.00 


59.00 


4492 


HO2


DEX 




18.367 


23.168 


34.580 


1.00 


59.00 


4493 


O3


DEX 




21.444 


20.210 


31.554


1.00 


59.00 


4494 


H3 


DEX 




21.502 


19.648


30.802 


1.00 


59.00 


4495 


O4


DEX 




19.127 


21.505 


29.299 


1.00 


59.00 


4496 


O5


DEX 




17.530 


19.572 


30.381 


1.00 


59.00 


4497 


H5 


DEX 




17.435 


18.711 


30.744 


1.00 


59.00 
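
Each row of the coordinate table above lists an atom number, an atom name, the residue name and Cartesian x, y, z coordinates in angstroms, followed by an occupancy and a B-factor. As a minimal sketch (the tuple layout and the helper name `distance` are illustrative assumptions, not part of the original disclosure), an interatomic distance can be computed directly from two rows, here atoms 4496 and 4497 of the dexamethasone ligand:

```python
import math

# Two rows transcribed from the table above: (atom number, atom name, x, y, z).
# Coordinates are in angstroms.
O5 = (4496, "O5", 17.530, 19.572, 30.381)
H5 = (4497, "H5", 17.435, 18.711, 30.744)

def distance(atom_a, atom_b):
    """Euclidean distance between two atoms given as (number, name, x, y, z)."""
    return math.dist(atom_a[2:], atom_b[2:])

print(f"{O5[1]}-{H5[1]} distance: {distance(O5, H5):.2f} angstroms")
```

The result is close to 0.94 angstroms, which is in the range expected for a hydroxyl O-H bond and serves as a quick sanity check on the transcribed values.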
TABLE 5

ATOMIC COORDINATES FOR THE GR/SRC-1 MODEL USED IN MOLECULAR 

REPLACEMENT 



ATOM


ATOM 


RESIDUE


PROTEIN 

# 


# 


X 


Y 


Z 


occ 


1 


N 


GLN 


527 


-10.228 


40.054 


15.641 


1 on 


fiQ *3fi 


2 


CA 


GLN 


527 


-10.481 


38.584 


15.329 


1 on 


fifi R4 


3 


C 


GLN 


527 


-9.230 


37.821 


15.751 


1 Of) 


fifi 47 


4 


O 


GLN 


527 


-9.189 


37.229 


16.832 


1 on 


OD.OZ 


5 


CB 


GLN 


527 


-10.824 


38.264 


13 878 


1 no 


fift 47 


6 


CG 


GLN 


527 


-11.131 


36.765 


13 555 


1 00 


qq on 


7 


CD 


GLN 


527 


-11.424


36.357 


12 106 


1 00 


QQ Qn 


8 


OE1 


GLN 


527 


-11.629 


35.191 


11 807 


1 00 


qq Qn 


9 


NE2 


GLN 


527 


-11.432 


37.263 


1 1 161 


1 on 


QQ Qn 


10 


N 


LEU 


528 


-8.211 


37.835 


14 896 


1 00 


fi^ ^n 


11 


CA 


LEU 


528 
41.30

43.03 

41.27 

40.09 

39.04 

39.24 

45.64 

44.78 

49.29 

53.51 

53.94 

54.96 


2136 
2137 


OE1 
NE2 


GLN 
GLN 


695 
695 f 


0.874 


67.191 

Of 


35.133 

Oc n A A 

o5.91 1 


1.00 

A r\r\ 

1.00 


55.61 
55.52 


2138 
2139 


C 
O 


GLN 
GLN 


695 
695 


-0 8^4 
-1.931 


66.592 

DD.DDD 

66.813 


35.592 
ol.o99 
32.236 


1.00 

A f\r\ 

1.00 
1.00 


55.62 
55.94 
55.76 


2140  N    GLU  696  -0.686  66.579  30.380  1.00  58.97
2141  CA   GLU  696  -1.829  66.652  29.470  1.00  62.61
2142  CB   GLU  696  -2.620  67.945  29.711  1.00  63.02
2143  CG   GLU  696  -1.785  69.190  30.082  1.00  63.82
2144  CD   GLU  696  -0.758  69.598  29.037  1.00  64.13
2145  OE1  GLU  696  -1.131  69.774  27.856  1.00  64.51
2146  OE2  GLU  696   0.422  69.758  29.402  1.00  64.16
2147  C    GLU  696  -1.374  66.630  28.019  1.00  64.98
2148  O    GLU  696  -2.107  66.233  27.112  1.00  65.58
2149  N    GLY  697  -0.140  67.073  27.836  1.00  67.35
2150  CA   GLY  697   0.479  67.171  26.535  1.00  70.03
2151  C    GLY  697   1.252  68.446  26.701  1.00  71.80
2152


O 


GLY 


697 


1.323 


69.262 


25.756 


1.00 


71.76 


2153 


OXT 


GLY 


697 


1.783 


68.647 


27.817 


1.00 


71.76 



It will be understood that various details of the invention may be changed
without departing from the scope of the invention. Furthermore, the foregoing
description is for the purpose of illustration only, and not for the purpose of
limitation, the invention being defined by the claims.
CLAIMS

What is claimed is: 

1. A method of modifying a test NR polypeptide, the method
comprising:

(a) providing a test NR polypeptide sequence having a characteristic
that is targeted for modification;

(b) aligning the test NR polypeptide sequence with at least one
reference NR polypeptide sequence for which an X-ray structure is
available, wherein the at least one reference NR polypeptide
sequence has a characteristic that is desired for the test NR
polypeptide;

(c) building a three-dimensional model for the test NR polypeptide using
the three-dimensional coordinates of the X-ray structure(s) of the at
least one reference polypeptide and its sequence alignment with the
test NR polypeptide sequence;

(d) examining the three-dimensional model of the test NR polypeptide
for a difference in an amino acid residue as compared to the at least
one reference polypeptide, wherein the residues are associated with
the desired characteristic; and

(e) mutating an amino acid residue in the test NR polypeptide sequence
located at a difference identified in step (d) to a residue associated
with the desired characteristic, whereby the test NR polypeptide is
modified.
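
The aligning and examining steps of claim 1 can be illustrated with a short sketch. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings with `-` marking gaps; the function name and the toy sequences are hypothetical and do not come from the disclosure.

```python
def alignment_differences(test_aln, ref_aln, gap="-"):
    """Return (test_position, test_residue, ref_residue) for each aligned
    column where both sequences have a residue and the residues differ.
    Positions are 1-based indices into the ungapped test sequence."""
    if len(test_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have equal length")
    differences = []
    test_pos = 0
    for t, r in zip(test_aln, ref_aln):
        if t != gap:
            test_pos += 1
        if t != gap and r != gap and t != r:
            differences.append((test_pos, t, r))
    return differences

# Hypothetical toy alignment, not taken from any real NR sequence.
test_aln = "MKV-LFA"
ref_aln = "MRVGLSA"
print(alignment_differences(test_aln, ref_aln))
# prints [(2, 'K', 'R'), (5, 'F', 'S')]
```

Each reported position is only a candidate for mutation; the claim additionally requires that the difference be associated with the desired characteristic, which this sketch does not assess.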

2. The method of claim 1, wherein the reference NR polypeptide
sequence is a PR sequence, and wherein the test polypeptide sequence is a GR
polypeptide sequence.

3. The method of claim 1, wherein the polypeptide of a crystalline GR
LBD is used as the reference polypeptide sequence.

4. The method of claim 1, wherein the method is carried out in a
bacterial expression system.
5. The method of claim 1, wherein the bacteria is E. coli.

6. A method for modifying a test NR polypeptide to improve the
solubility, stability in solution and other solution behavior, to alter and preferably
improve the folding and stability of the folded structure, to alter and preferably
improve the ability to form ordered crystals, or combination thereof, the method
comprising:

(a) providing a test NR polypeptide sequence for which the solubility,
stability in solution, other solution behavior, tendency to fold
properly, ability to form ordered crystals, or combination thereof is
different from that desired;

(b) aligning the test NR polypeptide sequence with the sequences of
one or more reference NR polypeptides for which the X-ray structure
is available and for which the solution properties, folding behavior
and crystallization properties are closer to those desired;

(c) building a three-dimensional model for the test NR polypeptide using
the three-dimensional coordinates of the X-ray structure(s) of the
one or more of reference polypeptides and their sequence alignment
with the test NR polypeptide sequence;

(d) examining the three-dimensional model of the test NR polypeptide
for lipophilic side-chains that are exposed to solvent, for clusters of
two or more lipophilic side-chains exposed to solvent, for lipophilic
pockets and clefts on the surface of the protein model, for sites on
the surface of the protein model that are more lipophilic than the
corresponding sites on the structure(s) of the reference NR
polypeptide(s), or combinations thereof;

(e) for each residue identified in step (d), mutating the amino acid to an
amino acid with different hydrophilicity, whereby the exposed
lipophilic sites are reduced, and the solution properties improved;

(f) examining the three-dimensional model at each site where the
amino acid in the test NR polypeptide is different from the amino
acid at the corresponding position in the reference NR polypeptide,
and checking whether the amino acid in the test NR polypeptide
makes favorable interactions with the atoms that lie around it in the
three-dimensional model, considering the side-chain conformations
predicted in step (c), considering alternative conformations of the
side-chains, considering the presence of water molecules, or
combinations thereof;

(g) for each residue identified in step (f) as not making favorable
interactions with the atoms that lie around it, mutating the residue to
another amino acid that makes favorable interactions with the atoms
that lie around it, thereby promoting the tendency for the test NR
polypeptide to fold into a stable structure with improved solution
properties, less tendency to unfold, and greater tendency to form
ordered crystals;

(h) examining the three-dimensional model at each residue position
where the amino acid in the test NR polypeptide is different from the
amino acid at the corresponding position in the reference NR
polypeptide, and checking whether the steric packing, hydrogen
bonding and other energetic interactions could be improved by
mutating that residue or any one or more of the surrounding
residues lying within 8 angstroms in the three-dimensional model;

(i) for each residue position identified in step (h) as potentially allowing
an improvement in the packing, hydrogen bonding and energetic
interactions, mutating those residues individually or in combination
to residues that improve the packing, hydrogen bonding, energetic
interactions, and combinations thereof, thereby promoting the
tendency for the test NR polypeptide to fold into a stable structure
with improved solution properties, less tendency to unfold, and
greater tendency to form ordered crystals.
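
Step (d) of claim 6, the search for solvent-exposed lipophilic side-chains, might be sketched as a simple filter. The sketch assumes that a relative solvent exposure (a fraction between 0 and 1) has already been computed for each residue of the model by some external program; the residue set, the 0.4 cutoff and the toy input are illustrative assumptions, not values from the disclosure.

```python
# Residues treated as lipophilic for this sketch (one-letter codes).
LIPOPHILIC = set("AVLIMFWYC")

def exposed_lipophilic(residues, min_exposure=0.4):
    """residues: iterable of (position, one_letter_code, relative_exposure),
    where relative_exposure is a 0..1 solvent accessibility fraction computed
    elsewhere from the three-dimensional model.
    Returns the (position, code) pairs to consider for mutation."""
    return [
        (pos, aa)
        for pos, aa, exposure in residues
        if aa in LIPOPHILIC and exposure >= min_exposure
    ]

# Hypothetical exposure values for a toy four-residue model.
model = [(1, "V", 0.55), (2, "L", 0.05), (3, "F", 0.48), (4, "K", 0.90)]
print(exposed_lipophilic(model))
# prints [(1, 'V'), (3, 'F')]
```

A per-residue cutoff like this does not detect clusters, pockets or clefts, which the claim also recites; those would need spatial information from the model.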



7. The method of claim 6, further comprising optimizing the side-chain
conformations in the three-dimensional model of the test NR polypeptide by
generating many alternative side-chain conformations, refining by energy
minimization, and selecting side-chain conformations with lower energy.
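
The generate, refine and select procedure of claim 7 can be sketched on a deliberately simplified problem. Here a conformation is a single side-chain torsion angle, the energy function is a made-up threefold torsional term plus a bias, and refinement is plain gradient descent; all of these are assumptions chosen for illustration and stand in for a real molecular mechanics force field.

```python
import math

def energy(chi):
    """Toy torsional energy (arbitrary units) for a side-chain angle chi in degrees."""
    threefold = 1.0 + math.cos(math.radians(3.0 * chi))
    bias = 0.5 * (1.0 - math.cos(math.radians(chi - 180.0)))
    return threefold + bias

def refine(chi, steps=200, rate=50.0, h=1e-3):
    """Energy-minimize one conformation by gradient descent,
    using a central-difference estimate of the gradient."""
    for _ in range(steps):
        gradient = (energy(chi + h) - energy(chi - h)) / (2.0 * h)
        chi -= rate * gradient
    return chi

def select_conformation(candidates):
    """Refine every candidate and return the refined conformation of lowest energy."""
    return min((refine(chi) for chi in candidates), key=energy)

# Generate alternative starting conformations every 30 degrees.
candidates = range(0, 360, 30)
best = select_conformation(candidates)
print(f"selected chi = {best:.1f} degrees, energy = {energy(best):.3f}")
```

With this toy energy the selected angle lies near 180 degrees. Starting from many conformations matters because local minimization alone would leave some candidates trapped in the higher-energy wells near 60 and 300 degrees.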
8. The method of claim 6, wherein the mutating of step (e) further
comprises a mutation to a more hydrophilic amino acid. 

9. The method of claim 6, wherein the reference NR polypeptide is PR,
and wherein the test NR polypeptide is GRα.

10. The method of claim 6, wherein the reference NR polypeptide is
GRα, and wherein the test NR polypeptide is GRβ or MR.

11. The method of claim 6, wherein the method is carried out in a 
bacterial expression system. 

12. The method of claim 6, wherein the bacteria is E. coli.

13. An isolated GR polypeptide comprising a mutation in a ligand 
binding domain, wherein the mutation alters the solubility of the ligand binding 
domain. 


14. An isolated GR polypeptide, or functional portion thereof, having one 
or more mutations comprising a substitution of a hydrophobic amino acid residue 
by a hydrophilic amino acid residue in a ligand binding domain. 

15. The isolated polypeptide of claims 13 or 14, wherein the mutation is
at a residue selected from the group consisting of V552, W557, F602, L636, Y648,
W712, L741, L535, V538, C638, M691, V702, Y648, Y660, L685, M691, V702,
W712, L733, Y764 and combinations thereof.

16. The isolated polypeptide of claims 13 or 14, wherein the mutation is
selected from the group consisting of V552K, W557S, F602S, F602D, F602E,
F602Y, F602T, F602N, F602C, L636E, Y648Q, W712S, L741R, L535T, V538S,
C638S, M691T, V702T, W712T and combinations thereof.
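
The mutations above use the conventional notation of wild-type residue, residue number and replacement residue, so F602S denotes phenylalanine 602 replaced by serine. A minimal sketch of applying such a mutation to a sequence string follows; the function name, the `first_residue_number` parameter and the five-residue fragment are hypothetical, and the fragment is not the real GR sequence.

```python
import re

MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def apply_mutation(sequence, mutation, first_residue_number=1):
    """Apply a point mutation such as 'F602S' to a one-letter sequence.
    first_residue_number is the residue number of sequence[0]."""
    match = MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, number, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = number - first_residue_number
    if not 0 <= index < len(sequence):
        raise ValueError(f"residue {number} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at residue {number}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]

# Hypothetical fragment numbered 600-604; not the real GR sequence.
fragment = "ALFKE"
print(apply_mutation(fragment, "F602S", first_residue_number=600))
# prints ALSKE
```

Checking the wild-type residue before substituting guards against numbering offsets, a common source of error when a construct starts partway through the full-length protein.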
17. An isolated GR LBD polypeptide, or functional portion thereof,
having a F602S mutation or a F602D mutation, or a phenylalanine to serine or
phenylalanine to aspartic acid mutation at an analogous position in the sequence
in any polypeptide based on sequence alignment to GRα.

18. The isolated polypeptide of claim 17, wherein the polypeptide has 
the sequence of SEQ ID NO:12 or 14. 



19. An isolated nucleic acid molecule encoding a GR polypeptide of any
of claims 13-18. 

20. A chimeric gene, comprising the nucleic acid molecule of claim 19 
operably linked to a heterologous promoter. 

21. A vector comprising the chimeric gene of claim 20.

22. A host cell comprising the chimeric gene of claim 20. 

23. A method of detecting a nucleic acid molecule that encodes a GR 
polypeptide, the method comprising: 

(a) procuring a biological sample comprising nucleic acid material; 

(b) hybridizing the nucleic acid molecule of claim 19 under stringent 
hybridization conditions to the biological sample of (a), thereby 
forming a duplex structure between the nucleic acid of claim 19 and 
a nucleic acid within the biological sample; and 

(c) detecting the duplex structure of (b), whereby a GR encoding nucleic 
acid molecule is detected. 
24. An antibody that specifically recognizes a GR polypeptide of any of
claims 13-18. 



25. A method for producing an antibody that specifically recognizes a 
GR polypeptide, the method comprising: 

(a) recombinantly or synthetically producing a GR polypeptide of any of 
claims 13-18, or portion thereof; 

(b) formulating the polypeptide of (a) whereby it is an effective 
immunogen; 

(c) administering to an animal the formulation of (b) to generate an 
immune response in the animal comprising production of antibodies, 
wherein antibodies are present in the blood serum of the animal; and 

(d) collecting the blood serum from the animal of (c), the blood serum 
comprising antibodies that specifically recognize a GR polypeptide. 



26. A method for detecting a level of GR polypeptide, the method 
comprising: 

(a) obtaining a biological sample comprising peptidic material; and 

(b) detecting a GR polypeptide in the biological sample of (a) by 
immunochemical reaction with the antibody of claim 24, whereby an 
amount of GR polypeptide in a sample is determined. 



27. A method for identifying a substance that modulates GR LBD 
function, the method comprising: 

(a) isolating a GR LBD polypeptide of any of claims 13-18; 

(b) exposing the isolated GR polypeptide to a plurality of substances; 

(c) assaying binding of a substance to the isolated GR polypeptide; and 
(d) selecting a substance that demonstrates specific binding to the
isolated GR LBD polypeptide. 

28. A substantially pure GR ligand binding domain polypeptide in
crystalline form.

29. The polypeptide of claim 28, wherein the crystalline form comprises
lattice constants of a = b = 126.014 Å, c = 86.312 Å, α = 90°, β = 90°, γ = 120°.
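
As a worked check on the lattice constants recited in claim 29 (a = b = 126.014 angstroms, c = 86.312 angstroms, angles of 90, 90 and 120 degrees), the unit cell volume can be computed with the general triclinic formula. This is a minimal sketch; the function name is an assumption, and the formula is the standard crystallographic one rather than anything specific to the disclosure.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """General (triclinic) unit cell volume in cubic angstroms.
    Edge lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

volume = unit_cell_volume(126.014, 126.014, 86.312, 90, 90, 120)
print(f"unit cell volume: {volume:.0f} cubic angstroms")
```

For these constants the volume is roughly 1.19 million cubic angstroms. With angles of 90, 90 and 120 degrees the general expression reduces to the hexagonal form a² · c · sin(120°).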



30. The polypeptide of claim 28 or 29, wherein the crystalline form is a
hexagonal crystalline form.



31. The polypeptide of claim 28 or 29, wherein the crystalline form has a
space group of P61.

32. The polypeptide of claim 28 or 29, wherein the GRα ligand binding
domain polypeptide has the amino acid sequence shown in any one of SEQ ID 
NOs:12, 14, 16 and 31. 

33. The polypeptide of claim 28 or 29, wherein the GR ligand binding
domain polypeptide is in complex with a ligand.

34. The polypeptide of claim 33, wherein the ligand is a steroid. 

35. The polypeptide of claim 34, wherein the steroid is dexamethasone.

36. The polypeptide of claim 28 or 29, wherein the GR ligand binding 
domain polypeptide is in complex with a ligand and a peptide. 

37. The polypeptide of claim 36, wherein the ligand is a steroid.

38. The polypeptide of claim 37, wherein the steroid is dexamethasone. 
39. The polypeptide of claim 38, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator. 

40. The polypeptide of claim 36, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

41. The polypeptide of claim 36, wherein the ligand is dexamethasone
and the peptide comprises an LXXLL (SEQ ID NO: 18) motif.

42. The polypeptide of claim 36, wherein the peptide is a fragment of a 
TIF2 protein. 

43. The polypeptide of claim 42, wherein the ligand is dexamethasone
and the peptide has the amino acid sequence shown in any one of SEQ ID
NO:17.

44. The polypeptide of claim 28 or 29, wherein the GR ligand binding
domain has a crystalline structure further characterized by the atomic structure
coordinates shown in Table 4.

45. The polypeptide of claim 28 or 29, wherein the crystalline form
contains two GRα ligand binding domain polypeptides in the asymmetric unit.

46. The polypeptide of claim 28 or 29, wherein the crystalline form is
such that the three-dimensional structure of the crystallized GR ligand binding
domain polypeptide can be determined to a resolution of about 2.8 Å or better.

47. The polypeptide of claim 28 or 29, wherein the crystalline form
contains one or more atoms having a molecular weight of 40 grams/mol or
greater.
48. A method for determining the three-dimensional structure of
crystallized GR ligand binding domain polypeptide to a resolution of about 2.8 Å or
better, the method comprising:

(a) crystallizing a GR ligand binding domain polypeptide; and

(b) analyzing the GR ligand binding domain polypeptide to determine
the three-dimensional structure of the crystallized GR ligand binding
domain polypeptide, whereby the three-dimensional structure of a
crystallized GR ligand binding domain polypeptide is determined to a
resolution of about 2.8 Å or better.

49. The method of claim 48, wherein the analyzing is by X-ray 
diffraction. 



50. The method of claim 48, wherein the crystallization is accomplished 
by the hanging drop method, and wherein the GR ligand binding domain is mixed 
with a reservoir. 

51. The method of claim 50, wherein the reservoir comprises 50mM 
HEPES, pH 7.5-8.5, and 1.7-2.3M ammonium formate. 






52. The method of claim 48, wherein the crystallizing further comprises
crystallizing the GRα ligand binding domain with a ligand and a peptide.

53. The method of claim 52, wherein the ligand is a steroid. 

54. The method of claim 53, wherein the ligand is dexamethasone.



55. The method of claim 52, wherein the ligand is a steroid and the 
peptide is a fragment of a co-activator. 


56. The method of claim 52, wherein the ligand is a steroid and the 
peptide is a fragment of a co-repressor. 
57. The method of claim 52, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO: 18) motif.

58. The method of claim 52, wherein the peptide is a fragment of a TIF2
protein.

59. The method of claim 52, wherein the ligand is dexamethasone and 
the peptide has the amino acid sequence shown in SEQ ID NO: 17. 


60. A method of generating a crystallized GR ligand binding domain 
polypeptide, the method comprising: 

(a) incubating a solution comprising a GR ligand binding domain with a 
reservoir; and 

(b) crystallizing the GR ligand binding domain polypeptide using the
hanging drop method, whereby a crystallized GR ligand binding
domain polypeptide is generated.

61. The method of claim 60, wherein the incubating further comprises
incubating the GR ligand binding domain with a ligand and a peptide.

62. The method of claim 61, wherein the ligand is a steroid.

63. The method of claim 62, wherein the steroid is dexamethasone. 


64. The method of claim 61, wherein the ligand is a steroid and the 
peptide is a fragment of a co-activator. 

65. The method of claim 61, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

66. The method of claim 61, wherein the ligand is dexamethasone and 
the peptide comprises an LXXLL (SEQ ID NO: 18) motif. 
67. The method of claim 61, wherein the peptide is a fragment of a TIF2
protein. 



68. A crystallized GRα ligand binding domain polypeptide produced by
the method of claim 60. 

69. A method of designing a modulator of a nuclear receptor, the 
method comprising: 

(a) designing a potential modulator of a nuclear receptor that will make 
interactions with amino acids in the ligand binding site of the nuclear 
receptor based upon the atomic structure coordinates of a GR ligand 
binding domain polypeptide; 

(b) synthesizing the modulator; and 

(c) determining whether the potential modulator modulates the activity 
of the nuclear receptor, whereby a modulator of a nuclear receptor is 
designed. 

70. The method of claim 69, wherein the atomic structure coordinates
further comprise a ligand and a peptide bound to the GR ligand binding domain
polypeptide.

71. The method of claim 69, wherein the atomic structure coordinates 
are the atomic structural coordinates shown in Table 3. 






72. The method of claim 70, wherein the ligand is a steroid. 

73. The method of claim 72, wherein the steroid is dexamethasone. 

74. The method of claim 70, wherein the ligand is a steroid and the 
peptide is a fragment of a co-activator. 
75. The method of claim 70, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor. 

76. The method of claim 70, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO: 18) motif.

77. The method of claim 70, wherein the peptide is a fragment of a TIF2 
protein. 

78. A method of designing a modulator that selectively modulates the
activity of a GRα polypeptide, the method comprising:

(a) obtaining a crystalline form of a GRα ligand binding domain
polypeptide;

(b) determining the three-dimensional structure of the crystalline form of
the GRα ligand binding domain polypeptide; and

(c) synthesizing a modulator based on the three-dimensional structure
of the crystalline form of the GRα ligand binding domain polypeptide,
whereby a modulator that selectively modulates the activity of a
GRα polypeptide is designed.

79. The method of claim 78, wherein the method further comprises
contacting a GRα ligand binding domain polypeptide with the potential modulator;
and assaying the GRα ligand binding domain polypeptide for binding of the
potential modulator, for a change in activity of the GRα ligand binding domain
polypeptide, or both.

80. The method of claim 78, wherein the crystalline form is a hexagonal
form.

81. The method of claim 80, wherein the crystals are such that the
three-dimensional structure of the crystallized GRα ligand binding domain
polypeptide can be determined to a resolution of about 2.8 Å or better.
82. The method of claim 78, wherein the crystalline form comprises a
GRα ligand binding domain with a ligand and a peptide.

83. The method of claim 82, wherein the ligand is a steroid.

84. The method of claim 83, wherein the steroid is dexamethasone. 

85. The method of claim 82, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator.

86. The method of claim 82, wherein the ligand is a steroid and the 
peptide is a fragment of a co-repressor. 

87. The method of claim 82, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO:18) motif.

88. The method of claim 82, wherein the peptide is a fragment of a TIF2 
protein. 


89. The method of claim 78, wherein the three-dimensional structure of
the crystalline form of the GRα ligand binding domain polypeptide is described by
the atomic coordinates shown in Table 4.

90. A method of screening a plurality of compounds for a modulator of a
GR ligand binding domain polypeptide, the method comprising:

(a) providing a library of test samples; 

(b) contacting a GR ligand binding domain polypeptide with each test 
sample; 

(c) detecting an interaction between a test sample and the GR ligand 
binding domain polypeptide; 

(d) identifying a test sample that interacts with the GR ligand binding 
domain polypeptide; and 
(e) isolating a test sample that interacts with the GR ligand binding
domain polypeptide, whereby a plurality of compounds is screened 
for a modulator of a GR ligand binding domain polypeptide. 

91. The method of claim 90, wherein the test samples are bound to a 
substrate. 

92. The method of claim 90, wherein the test samples are synthesized 
directly on a substrate. 

93. A method for identifying a GR modulator, the method comprising: 

(a) providing atomic coordinates of a GR ligand binding domain to a 
computerized modeling system; and

(b) modeling ligands that fit spatially into the binding pocket of the GR 
ligand binding domain to thereby identify a GR modulator, whereby a 
GR modulator is identified. 
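
The notion of a ligand that fits spatially into the binding pocket can be illustrated with a crude steric-clash filter. This is only a sketch under stated assumptions: atoms are bare (x, y, z) points in angstroms, the 3.0 angstrom minimum separation is an arbitrary illustrative cutoff, and the coordinates are toy values rather than entries from the tables. Actual modeling systems use far more detailed scoring than a single distance test.

```python
import math

def fits_pocket(ligand_atoms, pocket_atoms, min_separation=3.0):
    """Return True if no ligand atom lies closer than min_separation
    (angstroms) to any pocket atom. Atoms are (x, y, z) tuples."""
    return all(
        math.dist(ligand_atom, pocket_atom) >= min_separation
        for ligand_atom in ligand_atoms
        for pocket_atom in pocket_atoms
    )

# Hypothetical pocket defined by three atoms, and two one-atom "ligands".
pocket = [(0.0, 0.0, 0.0), (8.0, 0.0, 0.0), (4.0, 6.0, 0.0)]
centered_ligand = [(4.0, 2.0, 0.0)]   # at least 4 angstroms from every pocket atom
clashing_ligand = [(1.0, 0.5, 0.0)]   # about 1.1 angstroms from the first pocket atom

print(fits_pocket(centered_ligand, pocket))   # prints True
print(fits_pocket(clashing_ligand, pocket))   # prints False
```

A filter of this kind only rejects poses that overlap the protein; it says nothing about favorable contacts, so it would be a first pass before any energetic evaluation.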

94. The method of claim 93, wherein the method further comprises 
identifying in an assay for GR-mediated activity a modeled ligand that increases or 
decreases the activity of the GR. 

95. The method of claim 93, wherein the atomic coordinates are the 
atomic coordinates shown in Table 4. 

96. A method of identifying a modulator that selectively modulates the
activity of a GRα polypeptide compared to other GR polypeptides, the method
comprising:

(a) providing atomic coordinates of a GRα ligand binding domain to a
computerized modeling system; and

(b) modeling a ligand that fits into the binding pocket of a GRα ligand
binding domain and that interacts with conformationally constrained
residues of a GRα conserved among GR subtypes, whereby a
modulator that selectively modulates the activity of a GRα
polypeptide compared to other polypeptides is identified.
97. The method of claim 96, wherein the method further comprises
identifying in a biological assay for GR activity a modeled ligand that selectively
binds to GRα and increases or decreases the activity of said GRα.

98. The method of claim 96, wherein the atomic coordinates are the
atomic coordinates shown in Table 4. 



99. A method of designing a modulator of a GR polypeptide, the method 
comprising: 

(a) selecting a candidate GR ligand;

(b) determining which amino acid or amino acids of a GR polypeptide
interact with the ligand using a three-dimensional model of a
crystallized protein comprising a GRα LBD;

(c) identifying in a biological assay for GR activity a degree to which the
ligand modulates the activity of the GR polypeptide;

(d) selecting a chemical modification of the ligand wherein the
interaction between the amino acids of the GR polypeptide and the
ligand is predicted to be modulated by the chemical modification;

(e) synthesizing a chemical compound with the selected chemical
modification to form a modified ligand;

(f) contacting the modified ligand with the GR polypeptide;

(g) identifying in a biological assay for GR activity a degree to which the
modified ligand modulates the biological activity of the GR
polypeptide; and

(h) comparing the biological activity of the GR polypeptide in the
presence of modified ligand with the biological activity of the GR
polypeptide in the presence of the unmodified ligand, whereby a
modulator of a GR polypeptide is designed.

100. The method of claim 99, wherein the GR polypeptide is a GRα
polypeptide.

101. The method of claim 99, wherein the three-dimensional model of a
crystallized protein is a GRα ligand binding domain with a ligand and a peptide.

102. The method of claim 101, wherein the ligand is a steroid.

103. The method of claim 101, wherein the steroid is dexamethasone.

104. The method of claim 101, wherein the ligand is a steroid and the 
peptide is a fragment of a co-activator. 

105. The method of claim 101, wherein the ligand is a steroid and the 
peptide is a fragment of a co-repressor. 

106. The method of claim 101, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO: 18) motif.

107. The method of claim 101, wherein the peptide is a fragment of a 
TIF2 protein. 

108. The method of claim 99, wherein the three-dimensional model is 
represented by the three dimensional coordinates shown in Table 4. 

109. The method of claim 99, wherein the method further comprises
repeating steps (a) through (f), if the biological activity of the GR polypeptide in the
presence of the modified ligand varies from the biological activity of the GR
polypeptide in the presence of the unmodified ligand.

110. An assay method for identifying a compound that inhibits binding of
a ligand to a GR polypeptide, the assay method comprising:

(a) designing a test inhibitor compound based on the three dimensional 
atomic coordinates of GR; 

(b) incubating a GR polypeptide with a ligand in the presence of a test
inhibitor compound;

(c) determining an amount of ligand that is bound to the GR
polypeptide, wherein decreased binding of ligand to the GR protein 
in the presence of the test inhibitor compound relative to binding of 
ligand in the absence of the test inhibitor compound is indicative of 
inhibition; and 

(d) identifying the test compound as an inhibitor of ligand binding if 
decreased ligand binding is observed, whereby a compound that 
inhibits binding of a ligand to a GR polypeptide is identified. 
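
The comparison recited in steps (c) and (d) can be sketched numerically as follows. This is a minimal illustration, assuming that bound ligand has been measured in arbitrary but consistent units with and without the test compound; the function names and the optional threshold parameter are hypothetical and are not part of the claimed assay.

```python
def percent_inhibition(bound_without, bound_with):
    """Percent decrease in bound ligand caused by the test compound."""
    if bound_without <= 0:
        raise ValueError("reference binding must be positive")
    return 100.0 * (bound_without - bound_with) / bound_without

def is_inhibitor(bound_without, bound_with, threshold_pct=0.0):
    """Flag the compound when binding decreases by more than the threshold.

    threshold_pct is a hypothetical cutoff; 0.0 means any decrease counts.
    """
    return percent_inhibition(bound_without, bound_with) > threshold_pct
```

For example, binding of 200 units without the compound and 50 units with it corresponds to 75 percent inhibition, so the compound would be flagged. In practice a nonzero threshold would be chosen to exceed the measurement noise.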

111. The method of claim 110, wherein the ligand is a steroid.

112. The method of claim 111, wherein the steroid is dexamethasone. 

113. The method of claim 110, wherein the three dimensional coordinates
are the three dimensional coordinates shown in Table 4. 



114. A method of identifying a NR modulator that selectively modulates
the biological activity of one NR compared to GRα, the method comprising:

(a) providing an atomic structure coordinate set describing a GRα
ligand binding domain structure and at least one other atomic
structure coordinate set describing a NR ligand binding domain,
each ligand binding domain comprising a ligand binding site;

(b) comparing the atomic structure coordinate sets to identify at least
one difference between the sets;

(c) designing a candidate ligand predicted to interact with the difference
of step (b);

(d) synthesizing the candidate ligand; and

(e) testing the synthesized candidate ligand for an ability to selectively
modulate a NR as compared to GRα, whereby a NR modulator that
selectively modulates the biological activity of a NR compared to GRα is
identified.
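
The comparison of step (b) can be sketched as follows, purely as an illustration. The sketch assumes that the two coordinate sets have already been superimposed and indexed by a shared alignment position, and that each entry holds a residue name and one representative atom coordinate. The function name, the data layout, and the 1.5 angstrom cutoff are hypothetical assumptions, not taken from the disclosure.

```python
import math

def find_differences(set_a, set_b, shift_cutoff=1.5):
    """List aligned positions that differ between two coordinate sets.

    Each set maps an alignment position to (residue_name, (x, y, z)).
    A position is reported if the residue identity differs or if the
    atom has moved by more than shift_cutoff (hypothetical, angstroms).
    """
    differences = []
    for pos in sorted(set_a.keys() & set_b.keys()):
        res_a, xyz_a = set_a[pos]
        res_b, xyz_b = set_b[pos]
        shift = math.dist(xyz_a, xyz_b)
        if res_a != res_b or shift > shift_cutoff:
            differences.append((pos, res_a, res_b, round(shift, 2)))
    return differences
```

Positions returned by such a comparison would be candidates for the design of step (c); a real workflow would compare all atoms lining the ligand binding site rather than one atom per residue.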

115. The method of claim 114, wherein the GRα atomic structure
coordinate set is the atomic structure coordinate set shown in Table 4.

116. The method of claim 114, wherein the NR is selected from the group
consisting of MR, PR, AR, GRβ and isoforms thereof that have ligands that also
bind GRα.

Time Course of GST-GR LBD (F602S) Binding to F-Dexamethasone

SEQUENCE LISTING



<110> Xu, Eric
Bledsoe, Randy K.
Montana, Valerie G.
McKee, David D.
Pearce, Kenneth H.
Stanley, Thomas B.
Apolito, Christopher J.
Lambert, Millard H.

<120> CRYSTALLIZED GLUCOCORTICOID RECEPTOR LIGAND BINDING DOMAIN POLYPEPTIDE AND SCREENING
METHODS EMPLOYING SAME

<130> Docket No. PU4523 

<140> 
<141> 

<150> 60/305,902 
<151> 2001-07-17 

<160> 41 

<170> PatentIn version 3.1

<210> 1 

<211> 2334 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(2334) 

<223> 

<400> 1 

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15
144



192 



tlr 111 til III 1% Hi !?* S £! £2 r 2 s ; at aaa acc 

20 y Cc Asp Phe T y r L y s Thr 

25 30 

Si E ffi lit 111 £ S2 LyI £ £ S Ser J" ^ «*' 
35 ^ Ser Ser Pro Se ^ Leu 

40 45 

si & in ss § aa „ ss r S p s in m tit in «t.tt, fl « ff .t * 

50 cc y c * An Ar 9 Ar 9 Leu Leu Val Asp 

ib 60 

Phe Pro tyt HI £j ^ £ « cag cag oca gat Ctg tec aaa 

65 y V ^ S6r Asn Ala Gln Gin Pro Asp Leu Ser Lys 

75 80 

s s s a s - sg a « ss a - c - s ». 

90 95 

a s s s s a sg k s a a as sk s K £ « 

105 110 

£ a g k s; s s a s a s e s ;2 e is - 

120 125 

Leu Asn A% 9 52 ^ £ ".'J ™ Glu a s a n p" ^ ^ 

130 ifc Glu Asn Pro Ser Ser Ala Ser 

135 140 

?hr SI £1 ttl S Sa C Pro Thr gf 9 f* 9 9 ? 9 tfct C " "« * Ct c ~ 
145 75 PhC Pr ° LyS Thr His 

155 160 

k 22 s s £ c a « a » is a s sg s 52e 

170 175 

£ !g S K 31! S £2 £ S s 5 « E - K j- ». 

185 190 

"e Leu SS S £ 9 Tlu Phe Ser Ser V* S" 99t ^ 

195 Ser Gly Ser Pro Gly Lys Glu Thr 

200 205 

£ B 2 £ 5 S S 5 2 £2 S s C E S 22 " 

215 220 

Leu Ser Pro Leu S2 Sv Glu SIS S" ttC ^ " 9 gaa aac "0 

225 230 P L6U LeU Glu Asn 

235 240 

£ ^ in it; t% til til ;s Le ta r 9 ? ac act aaa ccc aaa 

245 ASP Thr Lys Pro L ys 

250 255 

S S K «|» SS S ffi 2S K E s III K 53 s s " 

265 270 

2: a si e « e •« K K « s e «. s « .„ 

280 285 

Pro I? 9 ?al IS " 9 HI ttl 9 ? C 9tt tac W c«9 9ca 

290 V LyS Leu Gly Thr Val Tyr Cys Gin Ala 

295 300 

Ser Phe Pro oty As'n 52 il" T " a 3t9 tCt 9CC tct 

305 ^ e Ile Gly Asn L V S Met Ser Ala lie Ser 



310 ' 315 



320 
gtt cat ggt gtg agt acc tct gga gga cag atg tac cac tat gac atg 1008
Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

aat aca gca tcc ctt tct caa cag cag gat cag aag cct att ttt aat 1056
Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

gtc att cca cca att ccc gtt ggt tcc gaa aat tgg aat agg tgc caa 1104
Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

gga tct gga gat gac aac ttg act tct ctg ggg act ctg aac ttc cct 1152
Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

ggt cga aca gtt ttt tct aat ggc tat tca agc ccc agc atg aga cca 1200
Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

gat gta agc tct cct cca tcc agc tcc tca aca gca aca aca gga cca 1248
Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

cct ccc aaa ctc tgc ctg gtg tgc tct gat gaa gct tca gga tgt cat 1296
Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

tat gga gtc tta act tgt gga agc tgt aaa gtt ttc ttc aaa aga gca 1344
Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

gtg gaa gga cag cac aat tac cta tgt gct gga agg aat gat tgc atc 1392
Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

atc gat aaa att cga aga aaa aac tgc cca gca tgc cgc tat cga aaa 1440
Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

tgt ctt cag gct gga atg aac ctg gaa gct cga aaa aca aag aaa aaa 1488
Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

ata aaa gga att cag cag gcc act aca gga gtc tca caa gaa acc tct 1536
Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca ctg ttg gag gtt att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct gtt cca gac tca act tgg agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg ttt ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Phe Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610

cct gat ctg att att 
Pro Asp Leu lie lie 
625 

gac caa tgt aaa cac 
Asp Gin Cys Lys His 
645 

cag gta tct tat gaa 
Gin Val ser Tyr Glu 
660 

tct tea gtt cct aag 
Ser Ser Val Pro Lys 
675 

att aga atg acc tac 
He Arg Met Thr Tyr 
690 
615



620 



gaa gg a aac tec age 
Glu Gly Asn Ser Ser 
705 

etc ttg gat tct atg 
Leu Leu Asp Ser Met 
725 

ttc caa aca ttt ttq 
Phe Gin Thr Phe Leu 
740 

tta get gaa ate ate 
Leu Ala Glu lie He 
755 

ate aaa aaa ctt ctg 
He Lys Lys Leu Leu 
770 



<210> 2 

<211> 777 

<212> PRT

<213> Homo sapiens 



It " 9 aga atg act cta ccc tgc atg tac 

Asn Glu Gin Arg Met Thr Leu Pro Cys Me? Tyr 
635 640 

£-? T ctg * at gtt tcc tct gag tta cac a s<? ctt 

Met Leu Tyr Val ser Ser Glu Leu His Arg Leu 
650 655 

G?u Tvr f tC J 9t at9 383 acC tta ctt * ctt etc 
Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu 

665 670 

HI C 9 ll T Ct9 gag *** 

Asp Gly Leu Lys Ser Gin Glu Leu Phe Asp Glu 
680 685 

ate aaa gag cta gga aaa gee a :t gtc aag agg 
He Lys Glu Leu Gly Lys Ala He Val Lys Arg 
695 7Q0 

r?^ J" 039 Cgg tfct tat caa ct <? a ca aaa 

Gin Asn Trp Gin Arg Phe Tyr Gin Leu Thr Lys 

715 720 

cat gaa gtg gtt gaa aat etc ctt aac tat tgc 
His Glu Val val Glu Asn Leu Leu Asn Tyr Cys 
730 735 

III f 39 tu° atg agt att gaa ttc ccc atg 
Asp Lys Thr Met Ser He Glu Phe Pro Glu Met 

7 «5 750 

acc aat cag ata cca aaa tat tea aat gga aat 
Thr Asn Gin He Pro Lys Tyr Ser Asn Gly Asn 
760 765 

ttt cat caa aag tga 
Phe His Gin Lys 
775 



1920 



1968 



2016 



2064 



2112 



2160 



2208 



2256 



2304 



2334 



<400> 2 

Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Phe Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 3 

<211> 2334 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> CDS 

<222> (1)..(2334)

<223> 

<400> 3 

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125
415



cct ccc aaa etc tgc ctg gtg tgc tct gat gaa get tea aaa tat M * 
Pro Pro Lys Leu Cys Leu Val Cys Ser Lp 2lu 22 Ser lly Cys His 



624 



672 



s s a a e is s s a c ss B k hi k k « 

iJ5 140 

si ?i? sii si s a c p« ^ 9ag r 9 ?? g tts cca aaa «* ««= « 0 

14 5 ?£ a Pr ° Thr Giu Glu Phe Pro Lys Thr His 

° 155 160 

s z s £ s s a a s a s is a s is s S!8 
s a; ss i;i tit s a $ s e 5 si = e s c »• 

185 190 

"i £2 2 k a a 2: s s ?s e = 5 H s 

200 205 

m a s s js 3 a k a a s 5 s s 5: a 

215 220 

L^u Sei p"i Liu SI G?v Glu Asd I"' ^° C " " 9 g3a aac ?20 

225 G ^ G1U ASP AS P Ser Phe Leu Glu Gly Asn 

230 "5 240 

a s s s a a s = s a » s s s s - ~ 

£ 5 S III S K S! E a E K s S £ E £ »« 

265 270 

a S SI E £ s s - - S s £ »j « » j- .« 

2 80 285 

s is e in a = » s a s e si si a s 812 

295 300 

ser Phe' Pro £J £J ^ ^ J" f at ?" «g tct gec att tet 

305 y LyS Met Ser Ala Ile s er 

315 320 

E SI 85 S3 S = X 11° £ SS SI? £ K « K «? - 

325 330 335 

s »• Hi s s = a a k a s s s s si 1056 

345 350 

e s 511 s s s e ss i~ a e s e s 5 si 



1104 



5 S SS E HI III ill S S III IS E III III III S 



1200 



370 375 — 38O 

Arc Thr III I? \ Ct t 3t 9gC tat tca ««c atg aga cca 

365 9 Val Phe ?" Asn G1 y T V r Ser Ser Pro Ser Met Arg Pro 

390 395 400 

in a si s s s: s s « s - s - - K s 
420



425 



430 



tat gga gtc tta act tgt gga agc tgt aaa gtt ttc ttc aaa aga gca 1344
Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

gtg gaa gga cag cac aat tac cta tgt gct gga agg aat gat tgc atc 1392
Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

atc gat aaa att cga aga aaa aac tgc cca gca tgc cgc tat cga aaa 1440
Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

tgt ctt cag gct gga atg aac ctg gaa gct cga aaa aca aag aaa aaa 1488
Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

ata aaa gga att cag cag gcc act aca gga gtc tca caa gaa acc tct 1536
Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca ctg ttg gag gtt att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct gtt cca gac tca act tgg agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg tcc ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Ser Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

cct gat ctg att att aat gag cag aga atg act cta ccc tgc atg tac 1920
Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

gac caa tgt aaa cac atg ctg tat gtt tcc tct gag tta cac agg ctt 1968
Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

cag gta tct tat gaa gag tat ctc tgt atg aaa acc tta ctg ctt ctc 2016
Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

tct tca gtt cct aag gac ggt ctg aag agc caa gag cta ttt gat gaa 2064
Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

att aga atg acc tac atc aaa gag cta gga aaa gcc att gtc aag agg 2112
Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

gaa gga aac tcc agc cag aac tgg cag cgg ttt tat caa ctg aca aaa 2160
Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

etc ttg gat tct atg cat gaa gtg gtt gaa aat etc ctt aac t*r ^
Leu Leu Asp Ser Met His Glu Val Val 2iu Asn tl III III £ 

730 735 

ss s s js a s w = s» a « K e is a k 

S S K s s s s a s S S i" £ 2 s s 

ate aaa aaa ctt ctg ttt cat caa aag tga 
He Lys Lys Leu Leu Phe His Gin Lvs 
77 0 775 



2208 



2256 



2304 



2334 



<210> 4 

<211> 777 

<212> PRT 

<213> Homo sapiens 

<400> 4 



Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190






Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Ser Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 5

<211> 2334

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(2334) 

<223> 



<400> 5 

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

ctc aat agg tcg acc agt gtt cca gag aac ccc aag agt tca gca tcc 432
Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

act gct gtg tct gct gcc ccc aca gag aag gag ttt cca aaa act cac 480
Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

tct gat gta tct tca gaa cag caa cat ttg aag ggc cag act ggc acc 528
Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

aac ggt ggc aat gtg aaa ttg tat acc aca gac caa agc acc ttt gac 576
Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

att ttg cag gat ttg gag ttt tct tct ggg tcc cca ggt aaa gag acg 624
Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

aat gag agt cct tgg aga tca gac ctg ttg ata gat gaa aac tgt ttg 672
Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

ctt tct cct ctg gcg gga gaa gac gat tca ttc ctt ttg gaa gga aac 720
Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240
teg aat gag gac tgc aag cct etc att tta cm n ^ 

s er Asn Giu ASP c y s Lys Pro Leu tit it: s fs a p T a „< £ ^ aaa 768 

250 255 

J2 E K E E E E K a £ - - s - s ~ • ... 

265 270 

a s s a s: s K « s K ~ K * * - ... 
E S ffi S3 K 5 £ S S a S a E £ 5 S -
- s s a s g = 5 = s a e e s s e -

345 350 

vtl ill Pro Pro 11" 52 III lit ITu a*' J™ 399 t9C Caa 1104 
355 • LS Fr ° Vai s " Glu Asn Trp Asn Arg Cys Gin 

365 

e 2 a; e s s a e e a a; s a « s « - 

375 380 

SJ E E !2 s k e is a - - - = a s s -0 
E E E K S S E IS E S s K S £ E E



1248 



415 



5 s a e m a 53 a 2 e c - e 5 a e »■ 

425 430 

E « s a s e s; ;s e s c e s s 5 s 

440 4<J5 

» E 1?; 3 E S S E E E E S E 5 E S 13,2 

455 460 

E E E ffi S S K K S S S S 5 S S S 
S S 2 E E K £ a E E E E E 3 E E 



485 4 9 0 

s s s s a a s e s k s s a e s 



gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca ctg ttg gag gtt att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct gtt cca gac tca act tgg agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg gac ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Asp Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

cct gat ctg att att aat gag cag aga atg act cta ccc tgc atg tac 1920
Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

gac caa tgt aaa cac atg ctg tat gtt tcc tct gag tta cac agg ctt 1968
Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

cag gta tct tat gaa gag tat ctc tgt atg aaa acc tta ctg ctt ctc 2016
Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

tct tca gtt cct aag gac ggt ctg aag agc caa gag cta ttt gat gaa 2064
Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

att aga atg acc tac atc aaa gag cta gga aaa gcc att gtc aag agg 2112
Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

gaa gga aac tcc agc cag aac tgg cag cgg ttt tat caa ctg aca aaa 2160
Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

ctc ttg gat tct atg cat gaa gtg gtt gaa aat ctc ctt aac tat tgc 2208
Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

ttc caa aca ttt ttg gat aag acc atg agt att gaa ttc ccc gag atg 2256
Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

tta gct gaa atc atc acc aat cag ata cca aaa tat tca aat gga aat 2304
Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

atc aaa aaa ctt ctg ttt cat caa aag tga 2334
Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 6 

<211> 777 

<212> PRT 

<213> Homo sapiens 



<400> 6 

Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

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Ser Phe Pro Gly Ala Asn lie lie Gly Asn Lys Met Ser Ala lie See 
305 310 315 320 



Val His Gly Val Ser Thr Ser Gly Gly Gin Met Tyr His Tyr Asp Met 
325 330 335 



Asn Thr Ala Ser Leu Ser Gin Gin Gin Asp Gin Lys Pro He Phe Asn 
340 345 350 



Val lie Pro Pro He Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gin 
355 360 365 



Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro 
370 375 380 



Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro 
385 390 395 400 



Asp Val ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro 
405 410 415 



Pro Pro Lys Leu cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His 
420 425 430 



Tyr Gly val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala 
435 440 445 



Val Glu Gly Gin His Asn Tyr Leu Cys Ala* Gly Arg Asn Asp Cys He 
450 455 460 



lie Asp Lys He Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys 
465 470 475 480 



Cys Leu Gin Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys 
485 490 495 



He Lys Gly He Gin Gin Ala Thr Thr Gly Val Ser Gin Glu Thr Ser 
500 505 510 



Glu Asn Pro Gly Asn Lys Thr He Val Pro Ala Thr Leu Pro Gin Leu 
515 520 525 



Thr Pro Thr Leu Val Ser Leu Leu Glu Val He Glu Pro Glu Val Leu 
530 535 540 



Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg He Met 
545 550 555 560 



Thr Thr Leu Asn Met Leu Gly Gly Arg Gin Val He Ala Ala Val Lys 
565 570 575 



Trp Ala Lys Ala He Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gin 
580 585 590 



Met Thr Leu Leu Gin Tyr Ser Trp Met Asp Leu Met Ala Phe Ala Leu 
595 600 605 
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Gly rrp Ar g Ser Tyr Ar g Gin Ser Ser Ala Asn Leu Leu Cys Phe Ala 

615 620 

Pro Asp Leu He He Asn Glu Gin Arg Met Thr Leu Pro Cys Met Tyr 

635 640 

Asp Gin cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu 

650 655 
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Gin Val ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu 

665 670 

Ser ser Val Pro Lys Asp Gly Leu Lys Ser Gin Glu Leu Phe Asp Glu 

680 685 



He Arg Met Thr Tyr He Lys Glu Leu Gly Lys Ala He Val Lys Arg 

Glu Gly Asn Ser Ser Gin Asn Trp Gin Arg Phe Tyr Gin Leu Thr Lys 
710 7" 720 



Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys 
740 



730 -, 35 
Phe Gin Thr Phe leu Asp Lys Thr Met Ser He 6 lu Phe Pro Glu Met 



Leu Ala Glu lie He Thr Asn Gin He Pro Lys Tyr Ser Asn Gly Asn 
705 760 



765 



He Lys Lys Leu Leu Phe His Gin Lys 
770 775 y 



<210> 7 

<211> 2334 
<220>
<223>

<220> 
atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

ctc aat agg tcg acc agt gtt cca gag aac ccc aag agt tca gca tcc 432
Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

act gct gtg tct gct gcc ccc aca gag aag gag ttt cca aaa act cac 480
Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

tct gat gta tct tca gaa cag caa cat ttg aag ggc cag act ggc acc 528
Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

aac ggt ggc aat gtg aaa ttg tat acc aca gac caa agc acc ttt gac 576
Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

att ttg cag gat ttg gag ttt tct tct ggg tcc cca ggt aaa gag acg 624
Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

aat gag agt cct tgg aga tca gac ctg ttg ata gat gaa aac tgt ttg 672
Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

ctt tct cct ctg gcg gga gaa gac gat tca ttc ctt ttg gaa gga aac 720
Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

tcg aat gag gac tgc aag cct ctc att tta ccg gac act aaa ccc aaa 768
Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

att aag gat aat gga gat ctg gtt ttg tca agc ccc agt aat gta aca 816
Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

ctg ccc caa gtg aaa aca gaa aaa gaa gat ttc atc gaa ctc tgc acc 864
Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

cct ggg gta att aag caa gag aaa ctg ggc aca gtt tac tgt cag gca 912
Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300
310 315 320

s K » e s s s; sg a a £ e e e k 

330 335 

5 E E E E E E E S 3 E E 'I.' E E 

345 350 
att cca cca att ccc gtt qot tcr naa * 

He Pro Pro He Pro Val riw It 2f * t99 aat ag9 tgc caa 
355 ?J Ser Glu Asn Trp Asn Arg Cys Gin 

360 365 

E 2 K E E E E £ £ E E E E E E E 

375 380 



age 
Ser 
305 

gtt 

Val 



aat 
Asn 



gtc 
Val 



ggt 

Gly 
38S 

gat 
Asp 



cct 
Pro 



tat 
Tyr 



E E S E E E E E E E E E E E E 

395 400 

E E S E E E E E E E E E E E E 
EEEEEEEEEEEEEEE 

425 430 

E ffi E E E E E E E E E E E E E 

**' a = 5 " « a -* s k s; e ?! e e e 

4i>5 46Q 

S 5 E E E E E E E E S E E E E E 

475 480 

Cys Leu S2 lit Ity Me? A« £ Ifu 22 ?" *" *" 

oxy wet Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys 

490 4 g 5 

J" E E E E E K E E E K E E E E E 

505 510 

E E E E E E E E E E E E E E E E 

520 525 

E E E E S E E E E = E E E E E E 

535 540 



tat gca gga tat gat age tct 
Tyr Ala Gly Tyr Asp Ser Ser 
545 550 

act acg etc aac atg tta gga 
Thr Thr Leu Asn Met Leu Gly 
565 

tgg gca aag gca ata cca ggt 
Trp Ala Lys Ala lie Pro Gly 
580 



nnn cca gac tea act nnn agg ate atg 
Xaa Pro Asp Ser Thr Xaa Arg n e Met 
5 55 560 

ggg egg caa gtg att gca gca gtg aaa 
Gly Arg Gin Val He Ala Ala Val Lys 

ttc agg aac tta cac ctg gat gac caa 
Phe Arg Asn Leu His Leu Asp Asp Gin 
585 590 



Me? rtl fo f 9 °t 9 t3C tC ° t99 atg nnn ctt *tg gca ttt get ctg 
Met Thr Leu Leu Gin Tyr Ser Trp Met Xaa Leu Me? Ala Phe Ala Leu 

595 600 605 



960 



1008 



1056 



1104 



1152 



1200 



1248 



1296 



1344 



1392 



1440 



1488 



1536 



1584 



1632 



1680 



1728 



1776 



1824 
ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

cct gat ctg att att aat gag cag aga atg act nnn ccc nnn atg tac 1920
Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Xaa Pro Xaa Met Tyr
625 630 635 640

gac caa tgt aaa cac atg ctg nnn gtt tcc tct gag tta cac agg ctt 1968
Asp Gln Cys Lys His Met Leu Xaa Val Ser Ser Glu Leu His Arg Leu
645 650 655

cag gta tct nnn gaa gag tat ctc tgt atg aaa acc tta ctg ctt ctc 2016
Gln Val Ser Xaa Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

tct tca gtt cct aag gac ggt ctg aag agc caa gag nnn ttt gat gaa 2064
Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Xaa Phe Asp Glu
675 680 685

att aga nnn acc tac atc aaa gag cta gga aaa gcc att nnn aag agg 2112
Ile Arg Xaa Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Xaa Lys Arg
690 695 700

gaa gga aac tcc agc cag aac nnn cag cgg ttt tat caa ctg aca aaa 2160
Glu Gly Asn Ser Ser Gln Asn Xaa Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

ctc ttg gat tct atg cat gaa gtg gtt gaa aat ctc nnn aac tat tgc 2208
Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Xaa Asn Tyr Cys
725 730 735

ttc caa aca ttt nnn gat aag acc atg agt att gaa ttc ccc gag atg 2256
Phe Gln Thr Phe Xaa Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

tta gct gaa atc atc acc aat cag ata cca aaa nnn tca aat gga aat 2304
Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Xaa Ser Asn Gly Asn
755 760 765

atc aaa aaa ctt ctg ttt cat caa aag tga 2334
Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 8 

<211> 777 

<212> PRT 

<213> Homo sapiens 



<220>
<221> misc_feature
<222> (535)..(535)
<223> The 'Xaa' at location 535 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (538)..(538)
<223> The 'Xaa' at location 538 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (552)..(552)
<223> The 'Xaa' at location 552 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (557)..(557)
<223> The 'Xaa' at location 557 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (602)..(602)
<223> The 'Xaa' at location 602 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (636)..(636)
<223> The 'Xaa' at location 636 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (638)..(638)
<223> The 'Xaa' at location 638 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (648)..(648)
<223> The 'Xaa' at location 648 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (660)..(660)
<223> The 'Xaa' at location 660 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (685)..(685)
<223> The 'Xaa' at location 685 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (691)..(691)
<223> The 'Xaa' at location 691 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (702)..(702)
<223> The 'Xaa' at location 702 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (712)..(712)
<223> The 'Xaa' at location 712 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (733)..(733)
<223> The 'Xaa' at location 733 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (741)..(741)
<223> The 'Xaa' at location 741 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (764)..(764)
<223> The 'Xaa' at location 764 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (1)..(2334)
<223> n = a or c or g or t/u

<400> 8 

Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

Thr Pro Thr Leu Val Ser Xaa Leu Glu Xaa Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Xaa Pro Asp Ser Thr Xaa Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Xaa Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Xaa Pro Xaa Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Xaa Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Xaa Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Xaa Phe Asp Glu
675 680 685

Ile Arg Xaa Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Xaa Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Xaa Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Xaa Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Xaa Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Xaa Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775



<210> 9 

<211> 774

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(774)

<223> 

<400> 9 



gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca ctg ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

gag gtt att gaa cct gaa gtg tta tat gca gga tat gat agc tct gtt 96
Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

cca gac tca act tgg agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg ttt ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act cta ccc tgc atg tac gac caa tgt aaa cac atg ctg tat 384
Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct tat gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag cta ttt gat gaa att aga atg acc tac atc aaa gag 528
Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att gtc aag agg gaa gga aac tcc agc cag aac tgg 576
Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc ctt aac tat tgc ttc caa aca ttt ttg gat aag acc 672
Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa tat tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga 774
Lys



<210> 10 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 10 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 11 

<211> 1548 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 
<222> (1)..(774)
<223> 



<400> 11 

gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca ctg ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

gag gtt att gaa cct gaa gtg tta tat gca gga tat gat agc tct gtt 96
Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

cca gac tca act tgg agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg tcc ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Ser Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act cta ccc tgc atg tac gac caa tgt aaa cac atg ctg tat 384
Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct tat gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag cta ttt gat gaa att aga atg acc tac atc aaa gag 528
Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att gtc aag agg gaa gga aac tcc agc cag aac tgg 576
Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc ctt aac tat tgc ttc caa aca ttt ttg gat aag acc 672
Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa tat tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga gttcctgcaa cgttaccaca actcacccct accctggtgt cactgttgga 824
Lys

ggttattgaa cctgaagtgt tatatgcagg atatgatagc tctgttccag actcaacttg 884

gaggatcatg actacgctca acatgttagg agggcggcaa gtgattgcag cagtgaaatg 944

ggcaaaggca ataccaggtt tcaggaactt acacctggat gaccaaatga ccctactgca 1004

gtactcctgg atgtccctta tggcatttgc tctggggtgg agatcatata gacaatcaag 1064

tgcaaacctg ctgtgttttg ctcctgatct gattattaat gagcagagaa tgactctacc 1124

ctgcatgtac gaccaatgta aacacatgct gtatgtttcc tctgagttac acaggcttca 1184

ggtatcttat gaagagtatc tctgtatgaa aaccttactg cttctctctt cagttcctaa 1244

ggacggtctg aagagccaag agctatttga tgaaattaga atgacctaca tcaaagagct 1304

aggaaaagcc attgtcaaga gggaaggaaa ctccagccag aactggcagc ggttttatca 1364

actgacaaaa ctcttggatt ctatgcatga agtggttgaa aatctcctta actattgctt 1424

ccaaacattt ttggataaga ccatgagtat tgaattcccc gagatgttag ctgaaatcat 1484

caccaatcag ataccaaaat attcaaatgg aaatatcaaa aaacttctgt ttcatcaaaa 1544

gtga 1548

<210> 12
<211> 257
<212> PRT
<213> Homo sapiens

<400> 12

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Ser Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 13 

<211> 774 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(774)

<223> 



<400> 13 

gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca ctg ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

gag gtt att gaa cct gaa gtg tta tat gca gga tat gat agc tct gtt 96
Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

cca gac tca act tgg agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg gac ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Asp Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95
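agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act cta ccc tgc atg tac gac caa tgt aaa cac atg ctg tat 384
Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct tat gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag cta ttt gat gaa att aga atg acc tac atc aaa gag 528
Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att gtc aag agg gaa gga aac tcc agc cag aac tgg 576
Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc ctt aac tat tgc ttc caa aca ttt ttg gat aag acc 672
Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa tat tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga 774
Lys

<210> 14
<211> 257
<212> PRT
<213> Homo sapiens

<400> 14

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Asp Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95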

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys
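
The 774-nucleotide coding sequences of SEQ ID NOs: 9, 11 and 13 appear to differ only at codon 82 of the fragment (residue 602 in full-length numbering): ttt (Phe), tcc (Ser) and gac (Asp), respectively. The following is a minimal illustrative sketch, not part of the listing itself, of how such a single-codon substitution and its translation could be checked. The function names `translate` and `substitute_codon` are hypothetical helpers introduced here, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code with bases ordered t, c, a, g
# (one-letter amino acid codes, '*' marks a stop codon).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate whitespace-separated codons; unknown codons such as 'nnn' become 'X'."""
    return "".join(CODON_TABLE.get(codon.lower(), "X") for codon in dna.split())


def substitute_codon(dna: str, index: int, new_codon: str) -> str:
    """Return the codon string with the codon at 0-based `index` replaced."""
    codons = dna.split()
    codons[index] = new_codon
    return " ".join(codons)


# Codons 81-96 of the fragment as given for SEQ ID NO: 9 (Phe at position 82).
ROW_81_96 = "atg ttt ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca"

wild_type = translate(ROW_81_96)                                  # Phe at 82
f602s = translate(substitute_codon(ROW_81_96, 1, "tcc"))          # Ser at 82
f602d = translate(substitute_codon(ROW_81_96, 1, "gac"))          # Asp at 82
degenerate = translate(substitute_codon(ROW_81_96, 1, "nnn"))     # Xaa at 82
```

Mapping an unresolved `nnn` codon to `X` mirrors the way the listing pairs `nnn` with `Xaa`; a real workflow would more likely rely on an established library than on this hand-built table.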



<210> 15 

<211> 774 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(774)
<223> n = a or c or g or t/u such that Xaa at positions 552, 557, 602,
636, 648, 712, 741, 535, 538, 638, 691, 702, 648, 660, 685, 733 and
764 can independently be Cys, Asn, Tyr, Lys, Ser, Asp, Glu, Gln, Arg
or Thr.

<220>
<221> misc_feature
<222> (1)..(774)
<223> n = a or c or g or t/u such that Xaa at positions 552, 557, 602,
636, 648, 712, 741, 535, 538, 638, 691, 702, 648, 660, 685, 733 and
764 can independently be Cys, Asn, Tyr, Lys, Ser, Asp, Glu, Gln, Arg
or Thr

<400> 15



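gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca nnn ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Xaa Leu
1 5 10 15

gag nnn att gaa cct gaa gtg tta tat gca gga tat gat agc tct nnn 96
Glu Xaa Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Xaa
20 25 30

cca gac tca act nnn agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Xaa Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg nnn ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Xaa Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act nnn ccc nnn atg tac gac caa tgt aaa cac atg ctg nnn 384
Arg Met Thr Xaa Pro Xaa Met Tyr Asp Gln Cys Lys His Met Leu Xaa
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct nnn gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Xaa Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag nnn ttt gat gaa att aga nnn acc tac atc aaa gag 528
Lys Ser Gln Glu Xaa Phe Asp Glu Ile Arg Xaa Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att nnn aag agg gaa gga aac tcc agc cag aac nnn 576
Leu Gly Lys Ala Ile Xaa Lys Arg Glu Gly Asn Ser Ser Gln Asn Xaa
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc nnn aac tat tgc ttc caa aca ttt nnn gat aag acc 672
Val Glu Asn Leu Xaa Asn Tyr Cys Phe Gln Thr Phe Xaa Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa nnn tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Xaa Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga 774
Lys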



<210> 16 

<211> 257 

<212> PRT 

<213> Homo sapiens 



<220>
<221> misc_feature
<222> (15)..(15)
<223> The 'Xaa' at location 15 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (18)..(18)
<223> The 'Xaa' at location 18 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (32)..(32)
<223> The 'Xaa' at location 32 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (37)..(37)
<223> The 'Xaa' at location 37 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (82)..(82)
<223> The 'Xaa' at location 82 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (116)..(116)
<223> The 'Xaa' at location 116 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (118)..(118)
<223> The 'Xaa' at location 118 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (128)..(128)
<223> The 'Xaa' at location 128 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (140)..(140)
<223> The 'Xaa' at location 140 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (165)..(165)
<223> The 'Xaa' at location 165 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (171)..(171)
<223> The 'Xaa' at location 171 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (182)..(182)
<223> The 'Xaa' at location 182 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (192)..(192)
<223> The 'Xaa' at location 192 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (213)..(213)
<223> The 'Xaa' at location 213 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (221)..(221)
<223> The 'Xaa' at location 221 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (244)..(244)
<223> The 'Xaa' at location 244 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (1)..(774)
<223> n = a or c or g or t/u such that Xaa at positions 552, 557, 602,
636, 648, 712, 741, 535, 538, 638, 691, 702, 648, 660, 685, 733 and
764 can independently be Cys, Asn, Tyr, Lys, Ser, Asp, Glu, Gln, Arg
or Thr

<400> 16 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Xaa Leu
1 5 10 15

Glu Xaa Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Xaa
20 25 30

Pro Asp Ser Thr Xaa Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Xaa Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Xaa Pro Xaa Met Tyr Asp Gln Cys Lys His Met Leu Xaa
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Xaa Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Xaa Phe Asp Glu Ile Arg Xaa Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Xaa Lys Arg Glu Gly Asn Ser Ser Gln Asn Xaa
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Xaa Asn Tyr Cys Phe Gln Thr Phe Xaa Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Xaa Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys

<210> 17 

<211> 25 

<212> PRT 

<213> Homo sapiens 



<400> 17

Gln Glu Pro Val Ser Pro Lys Lys Lys Glu Asn Ala Leu Leu Arg Tyr
1 5 10 15

Leu Leu Asp Lys Asp Asp Thr Lys Asp
20 25

<210> 18

<211> 5

<212> PRT

<213> Homo sapiens

<220>

<221> misc_feature

<222> (1)..(5)

<223> Xaa is any amino acid

<400> 18

Leu Xaa Xaa Leu Leu
1 5

<210> 19

<211> 67

<212> DNA

<213> Homo sapiens

<400> 19

cggcggcgcc atatgaaaaa aggtcatcat catcatcatc atggttcccc tatactaggt 60
tattgga 67



<210> 20 

<211> 33 

<212> DNA 

<213> Homo sapiens 



<400> 20 

cggcggcgcg gatccacgcg gaaccagatc cga 33 

<210> 21 

<211> 237 

<212> PRT

<213> Homo sapiens

<400> 21

Met Lys Lys Gly His His His His His His His Gly Ser Pro Ile Leu
1 5 10 15

Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro Thr Arg Leu Leu Leu
20 25 30

Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu Tyr Glu Arg Asp Glu
35 40 45

Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu Gly Leu Glu Phe Pro
50 55 60

Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys Leu Thr Gln Ser Met
65 70 75 80

Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn Met Leu Gly Gly Cys
85 90 95

Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu Gly Ala Val Leu Asp
100 105 110

Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser Lys Asp Phe Glu Thr
115 120 125

Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu Met Leu Lys Met Phe
130 135 140

Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn Gly Asp His Val Thr
145 150 155 160

His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp Val Val Leu Tyr Met
165 170 175

Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu Val Cys Phe Lys Lys
180 185 190

Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr Leu Lys Ser Ser Lys
195 200 205

Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala Thr Phe Gly Gly Gly
210 215 220

Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg Gly Ser
225 230 235

<210> 22 

<211> 32 

<212> DNA 

<213> Homo sapiens 

<400> 22 

tactcctgga tgtcccttat ggcatttgct ct 32

<210> 23

<211> 32

<212> DNA

<213> Homo sapiens

<400> 23

agagcaaatg ccataaggga catccaggag ta 32

<210> 24

<211> 32

<212> DNA

<213> Homo sapiens

<400> 24

tactcctgga tggaccttat ggcatttgct ct 32

<210> 25

<211> 32

<212> DNA

<213> Homo sapiens

<400> 25

agagcaaatg

<210> 26

<211> 252

<212> PRT

<213> Homo sapiens

<400> 26



Ala Leu Thr Pro Ser Pro Val Met Val Leu Glu Asn Ile Glu Pro Glu
1 5 10 15

Ile Val Tyr Ala Gly Tyr Asp Ser Ser Lys Pro Asp Thr Ala Glu Asn
20 25 30

Leu Leu Ser Thr Leu Asn Arg Leu Ala Gly Lys Gln Met Ile Gln Val
35 40 45

Val Lys Trp Ala Lys Val Leu Pro Gly Phe Lys Asn Leu Pro Leu Glu
50 55 60

Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Cys Leu Ser Ser Phe
65 70 75 80

Ala Leu Ser Trp Arg Ser Tyr Lys His Thr Asn Ser Gln Phe Leu Tyr
85 90 95

Phe Ala Pro Asp Leu Val Phe Asn Glu Glu Lys Met His Gln Ser Ala
100 105 110

Met Tyr Glu Leu Cys Gln Gly Met His Gln Ile Ser Leu Gln Phe Val
115 120 125

Arg Leu Gln Leu Thr Phe Glu Glu Tyr Thr Ile Met Lys Val Leu Leu
130 135 140

Leu Leu Ser Thr Ile Pro Lys Asp Gly Leu Lys Ser Gln Ala Ala Phe
145 150 155 160

Glu Glu Met Arg Thr Asn Tyr Ile Lys Glu Leu Arg Lys Met Val Thr
165 170 175

Lys Cys Pro Asn Asn Ser Gly Gln Ser Trp Gln Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Ser Met His Asp Leu Val Ser Asp Leu Leu Glu
195 200 205

Phe Cys Phe Tyr Thr Phe Arg Glu Ser His Ala Leu Lys Val Glu Phe
210 215 220

Pro Ala Met Leu Val Glu Ile Ile Ser Asp Gln Leu Pro Lys Val Glu
225 230 235 240

Ser Gly Asn Ala Lys Pro Leu Tyr Phe His Arg Lys
245 250



<210> 27 

<211> 252 

<212> PRT 

<213> Homo sapiens 

<400> 27 



Gln Leu Ile Pro Pro Leu Ile Asn Leu Leu Met Ser Ile Glu Pro Asp
1 5 10 15

Val Ile Tyr Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser
20 25 30

Leu Leu Thr Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val
35 40 45

Val Lys Trp Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp
50 55 60

Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe
65 70 75 80

Gly Leu Gly Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr
85 90 95

Phe Ala Pro Asp Leu Ile Leu Asn Glu Gln Arg Met Lys Glu Ser Ser
100 105 110

Phe Tyr Ser Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val
115 120 125

Lys Leu Gln Val Ser Gln Glu Glu Phe Leu Cys Met Lys Val Leu Leu
130 135 140

Leu Leu Asn Thr Ile Pro Leu Glu Gly Leu Arg Ser Gln Thr Gln Phe
145 150 155 160

Glu Glu Met Arg Ser Ser Tyr Ile Arg Glu Leu Ile Lys Ala Ile Gly
165 170 175

Leu Arg Gln Lys Gly Val Val Ser Ser Ser Gln Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Asn Leu His Asp Leu Val Lys Gln Leu His Leu
195 200 205

Tyr Cys Leu Asn Thr Phe Ile Gln Ser Arg Ala Leu Ser Val Glu Phe
210 215 220

Pro Glu Met Met Ser Glu Val Ile Ala Ala Gln Leu Pro Lys Ile Leu
225 230 235 240

Ala Gly Met Val Lys Pro Leu Leu Phe His Lys Lys
245 250



<210> 28

<211> 252

<212> PRT

<213> Homo sapiens

<400> 28

Glu Cys Gln Pro Ile Phe Leu Asn Val Leu Glu Ala Ile Glu Pro Gly
1 5 10 15

Val Val Cys Ala Gly His Asp Asn Asn Gln Pro Asp Ser Phe Ala Ala
20 25 30

Leu Leu Ser Ser Leu Asn Glu Leu Gly Glu Arg Gln Leu Val His Val
35 40 45

Val Lys Trp Ala Lys Ala Leu Pro Gly Phe Arg Asn Leu His Val Asp
50 55 60

Asp Gln Met Ala Val Ile Gln Tyr Ser Trp Met Gly Leu Met Val Phe
65 70 75 80

Ala Met Gly Trp Arg Ser Phe Thr Asn Val Asn Ser Arg Met Leu Tyr
85 90 95

Phe Ala Pro Asp Leu Val Phe Asn Glu Tyr Arg Met His Lys Ser Arg
100 105 110

Met Tyr Ser Gln Cys Val Arg Met Arg His Leu Ser Gln Glu Phe Gly
115 120 125

Trp Leu Gln Ile Thr Pro Gln Glu Phe Leu Cys Met Lys Ala Leu Leu
130 135 140

Leu Phe Ser Ile Ile Pro Val Asp Gly Leu Lys Asn Gln Lys Phe Phe
145 150 155 160

Asp Glu Leu Arg Met Asn Tyr Ile Lys Glu Leu Asp Arg Ile Ile Ala
165 170 175

Cys Lys Arg Lys Asn Pro Thr Ser Cys Ser Arg Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Ser Val Gln Pro Ile Ala Arg Glu Leu His Gln
195 200 205

Phe Thr Phe Asp Leu Leu Ile Lys Ser His Met Val Ser Val Asp Phe
210 215 220

Pro Glu Met Met Ala Glu Ile Ile Ser Val Gln Val Pro Lys Ile Leu
225 230 235 240

Ser Gly Lys Val Lys Pro Ile Tyr Phe His Thr Gln
245 250



<210> 29 

<211> 286 

<212> PRT

<213> Homo sapiens

<400> 29

Leu Thr Ala Asp Gln Met Val Ser Ala Leu Leu Asp Ala Glu Pro Pro
1 5 10 15

Ile Leu Tyr Ser Glu Tyr Asp Pro Thr Arg Pro Phe Ser Glu Ala Ser
20 25 30

Met Met Gly Leu Leu Thr Asn Leu Ala Asp Arg Glu Leu Val His Met
35 40 45

Ile Asn Trp Ala Lys Arg Val Pro Gly Phe Val Asp Leu Thr Leu His
50 55 60

Asp Gln Val His Leu Leu Glu Cys Ala Trp Leu Glu Ile Leu Met Ile
65 70 75 80

Gly Leu Val Trp Arg Ser Met Glu His Pro Gly Lys Leu Leu Phe Ala
85 90 95

Pro Asn Leu Leu Leu Asp Arg Asn Gln Gly Lys Cys Val Glu Gly Met
100 105 110

Val Glu Ile Phe Asp Met Leu Leu Ala Thr Ser Ser Arg Phe Arg Met
115 120 125

Met Asn Leu Gln Gly Glu Glu Phe Val Cys Leu Lys Ser Ile Ile Leu
130 135 140

Leu Asn Ser Gly Val Tyr Thr Phe Leu Ser Ser Thr Leu Lys Ser Leu
145 150 155 160

Glu Glu Lys Asp His Ile His Arg Val Leu Asp Lys Ile Thr Asp Thr
165 170 175

Leu Ile His Leu Met Ala Lys Ala Gly Leu Thr Leu Gln Gln Gln His
180 185 190

Gln Arg Leu Ala Gln Leu Leu Leu Ile Leu Ser His Ile Arg His Met
195 200 205

Ser Asn Lys Gly Met Glu His Leu Tyr Ser Met Lys Cys Lys Asn Val
210 215 220

Val Pro Leu Tyr Asp Leu Leu Leu Glu Met Leu Asp Ala His Arg Leu
225 230 235 240

His Ala Pro Thr Ser Arg Gly Gly Ala Ser Val Glu Glu Thr Asp Gln
245 250 255

Ser His Leu Ala Thr Ala Gly Ser Thr Ser Ser His Ser Leu Gln Lys
260 265 270

Tyr Tyr Ile Thr Gly Glu Ala Glu Gly Phe Pro Ala Thr Val
275 280 285


<210> 30

<211> 268

<212> PRT

<213> Homo sapiens

<400> 30

Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro Pro
1 5 10 15

His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser Met
20 25 30

Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met Ile
35 40 45

Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe Asp
50 55 60

Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met Gly
65 70 75 80

Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala Pro
85 90 95

Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile Leu
100 105 110

Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu Leu
115 120 125

Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu Leu
130 135 140

Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp Ser
145 150 155 160

Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu Val
165 170 175

Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met Arg
180 185 190

Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Ser Asn
195 200 205

Lys Gly Met Glu His Leu Leu Asn Met Lys Cys Lys Asn Val Val Pro
210 215 220



Val Tyr Asp Leu Leu Leu Glu Met Leu Asn Ala His Val Leu Arg Gly
225 230 235 240

Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys Ser Pro Ala Glu Asp Ser
245 250 255

Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln Ser Gln
260 265

<210> 31 

<211> 251 

<212> PRT 

<213> Homo sapiens 



<400> 31 

Gln Leu Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu
1 5 10 15

Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg
20 25 30

Ile Met Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala
35 40 45

Val Lys Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp
50 55 60

Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp Met Ser Leu Met Ala Phe
65 70 75 80

Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys
85 90 95

Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys
100 105 110

Met Tyr Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His
115 120 125

Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu
130 135 140

Leu Leu Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe
145 150 155 160

Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val
165 170 175

Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn
195 200 205

Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro
210 215 220

Glu Met Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn
225 230 235 240

Gly Asn Ile Lys Lys Leu Leu Phe His Gln Lys
245 250

<210> 32 

<211> 259 

<212> PRT 

<213> Homo sapiens 



<400> 32 

Gly Ser Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser
1 5 10 15

Leu Leu Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser
20 25 30

Ser Val Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu
35 40 45

Gly Gly Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro
50 55 60

Gly Phe Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr
65 70 75 80

Ser Trp Met Ser Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg
85 90 95

Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn
100 105 110

Glu Gln Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met
115 120 125

Leu Tyr Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu
130 135 140

Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp
145 150 155 160

Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile
165 170 175

Lys Glu Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln
180 185 190

Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His
195 200 205

Glu Val Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp
210 215 220

Lys Thr Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr
225 230 235 240

Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe
245 250 255

His Gln Lys



<210> 33 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 33 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Arg Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 34 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 34 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Leu Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240





Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 35 

<211> 257 

<212> PRT

<213> Homo sapiens

<400> 35

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg His Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 36 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 36 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Thr Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 37 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 37 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Met Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 38 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 38 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Thr Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Leu Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 39 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 39 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Phe Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Cys Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 40 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 40 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr His
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Thr Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 41 

<211> 257 

<212> PRT

<213> Homo sapiens

<400> 41

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Phe Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Asn
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys